REGULATION OF GENE EXPRESSION IN PLANTS

This invention relates to methods of modulating
the expression of desired genes in plants, and to DNA 
sequences and genetic constructs for use in these methods. 
In particular, the invention relates to methods and 
constructs for targeting of expression specifically to the 
endosperm of the seeds of cereal plants such as wheat, and 
for modulating the time of expression in the target tissue. 
This is achieved by the use of promoter sequences from 
enzymes of the starch biosynthetic pathway. In a preferred 
embodiment of the invention, the sequences and/or promoters 
are those of starch branching enzyme I, starch branching 
enzyme II, soluble starch synthase I, and starch debranching 
enzyme, all derived from Triticum tauschii, the D genome
donor of hexaploid bread wheat. 

A further preferred embodiment relates to a method 
of identifying variations in the characteristics of plants. 

BACKGROUND OF THE INVENTION 

Starch is an important constituent of cereal 
grains and of flours, accounting for about 65-67% of the 
weight of the grain at maturity. It is produced in the 
amyloplast of the grain endosperm by the concerted action of 
a number of enzymes, including ADP-glucose pyrophosphorylase
(EC 2.7.7.27), starch synthases (EC 2.4.1.21), branching 
enzymes (EC 2.4.1.18) and debranching enzymes (EC 3.2.1.41 
and EC 3.2.1.68) (Ball et al, 1996; Martin and Smith, 1995; 
Morell et al, 1995). Some of the proteins involved in the 
synthesis of starch can be recovered from the starch 
granule (Denyer et al, 1995; Rahman et al, 1995).

Most wheat cultivars normally produce starch
containing 25% amylose and 75% amylopectin. Amylose is
composed of large linear chains of α(1-4)-linked
α-D-glucopyranosyl residues, whereas amylopectin is a
branching form of α-glycan linked by α(1-6) linkages. The
ratio of amylose to amylopectin, the branch chain length and
the number of branch chains of amylopectin are the major
factors which determine the properties of wheat starch.

Starch with various properties has been widely
used in industry, food science and medical science. High
amylose wheat can be used for plastic substitutes and in
paper manufacture to protect the environment; in health
foods to reduce bowel cancer and heart disease; and in
sports foods to improve athletes' performance. High
amylopectin wheat may be suitable for Japanese noodles, and
is used as a thickener in the food industry.

Wheat contains three sets of chromosomes (A, B and
D) in its very large genome of about 10^10 base pairs (bp).
The donor of the D genome to wheat is Triticum tauschii, and
by using a suitable accession of this species the genes from
the D genome can be studied separately (Lagudah et al,
1991).

There is comparatively little variation in starch
structure found in wheat varieties, because the hexaploid
nature of wheat prevents mutations from being readily
identified. Dramatic alterations in starch structure are
expected to require the combination of homozygous recessive
alleles from each of the 3 wheat genomes, A, B and D. This
requirement renders the probability of finding such mutants
in natural or mutagenised populations of wheat very low.
Variation in wheat starch is desirable in order to enable
better tailoring of wheat starches for processing and
end-user requirements.
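
By way of illustration only, the rarity of such combined
mutants can be put in numbers. The Python sketch below assumes,
hypothetically, that a homozygous null allele arises
independently at each of the three homoeologous loci with the
same per-genome frequency. The function names and the example
frequencies are illustrative assumptions and are not taken from
this specification.

```python
def triple_null_frequency(per_genome_freq: float, genomes: int = 3) -> float:
    """Expected frequency of lines that are homozygous null in every genome.

    Assumes (hypothetically) independent loci with an identical
    per-genome frequency of homozygous null lines.
    """
    if not 0.0 <= per_genome_freq <= 1.0:
        raise ValueError("per_genome_freq must lie between 0 and 1")
    return per_genome_freq ** genomes


def expected_lines_to_screen(per_genome_freq: float, genomes: int = 3) -> float:
    """Average number of lines screened before one combined null is found."""
    freq = triple_null_frequency(per_genome_freq, genomes)
    return float("inf") if freq == 0.0 else 1.0 / freq


if __name__ == "__main__":
    # Illustrative frequencies only; real values depend on the population.
    for f in (0.1, 0.01, 0.001):
        print(f, triple_null_frequency(f), expected_lines_to_screen(f))
```

Under these assumptions the combined frequency falls as the
cube of the per-genome frequency, which is why screening a
single diploid genome is so much more tractable.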

Key commercial targets for the manipulation of
starch biosynthesis are:

1. "Waxy" wheats in which amylose content is
decreased to insignificant levels. This outcome is expected
to be obtained by eliminating granule-bound starch synthase
activity.

2. High amylose wheats, expected to be obtained
by suppressing starch branching enzyme-II activity.

3. Wheats which continue to synthesise starch
at elevated temperatures, expected to be obtained by
identifying or introducing a gene encoding a heat-stable
soluble starch synthase.

4. "Sugary types" of wheat which contain
increased amylose content and free sugars, expected to be
obtained by manipulating an isoamylase-type debranching
enzyme.

There are two general strategies which may be used 
to obtain wheats with altered starch structure: 

(a) using genetic engineering strategies to 
suppress the activity of a specific gene, or to introduce a 
novel gene into a wheat line; and 

(b) selecting among existing variation in wheat for 
missing ("null") or altered alleles of a gene in 
each of the genomes of wheat, and combining 
these by plant breeding. 

However, in view of the complexity of the gene families,
particularly starch branching enzyme I (SBE I), modification
of wheat by combining null alleles of several enzymes is in
general an almost impossible task unless regions which are
unique to the genes expressed in endosperm can be targeted.

Branching enzymes are involved in the production
of glucose α-1,6 branches. Of the two main constituents of
starch, amylose is essentially linear, but amylopectin is
highly branched; thus branching enzymes are thought to be
directly involved in the synthesis of amylopectin but not
amylose. There are two types of branching enzymes in
plants, starch branching enzyme I (SBE I) and starch branching
enzyme II (SBE II), and both are about 85 kDa in size. At
the nucleic acid level there is about 65% sequence identity
between types I and II in the central portion of the
molecules; the sequence identity between SBE I from
different cereals is about 85% overall (Burton et al, 1995;
Morell et al, 1995).

In cereals, SBE I genes have so far been reported
only for rice (Kawasaki et al, 1991; Rahman et al, 1997). A
cDNA sequence for wheat SBE I is available on the GenBank
database (Accession No. Y12320; Repellin A., Nair R.B., Baga
M., and Chibbar R.N.: Plant Gene Register PGR97-094, 1997).
As far as we are aware, no promoter sequence for wheat SBE I
has been reported.

We have characterised an SBE I gene, designated
wSBE I-D2, from Triticum tauschii, the donor of the D genome
to wheat (Rahman et al, 1997). This gene encoded a protein
sequence which had a deletion of approximately 65 amino
acids at the C-terminal end, and appeared not to contain
some of the conserved amino acid motifs characteristic of
this class of enzyme (Svensson, 1994). Although wSBE I-D2
was expressed as mRNA, no corresponding protein has yet been
found in our analysis of SBE I isoforms from the endosperm,
and thus it is possible that this gene is a transcribed
pseudogene.

Genes for SBE II are less well characterised; no
genomic sequences are available, although SBE II cDNAs from
rice (Mizuno et al, 1993; Accession No. D16201) and maize
(Fisher et al, 1993; Accession No. L08065) have been
reported. In addition, a cDNA sequence for SBE II from
wheat is available on the GenBank database (Nair et al,
1997; Accession No. Y11282); although the sequences are very
similar to those reported herein, there are differences near
the N-terminus of the protein, which specifies its
intracellular location. No promoter sequences have been
reported, as far as we are aware.

Wheat granule-bound starch synthase (GBSS) is
responsible for amylose synthesis, while wheat branching
enzymes together with soluble starch synthases are
considered to be directly involved in amylopectin
biosynthesis. A number of isoforms of soluble and
granule-bound starch synthases have been identified in developing
wheat endosperm (Denyer et al, 1995). There are three
distinct isoforms of starch synthases, 60 kDa, 75-77 kDa and
100-105 kDa, which exist in the starch granules (Denyer et
al, 1995; Rahman et al, 1995). The 60 kDa GBSS is the
product of the wx gene. The 75-77 kDa protein is a wheat
soluble starch synthase I (SSS I) which is present in both
the soluble fraction and the starch granule-bound fraction
of the endosperm. However, the 100-105 kDa proteins, which
are another type of soluble starch synthase, are located
only in starch granules (Denyer et al, 1995; Rahman et al,
1995). To our knowledge there has been no report of any
complete wheat SSS I sequence, either at the protein or the
nucleotide level.

Both cDNA and genomic DNA encoding a soluble
starch synthase I of rice have been cloned and analysed
(Baba et al, 1993; Tanaka et al, 1995). The cDNAs encoding
potato soluble starch synthase SSSII and SSSIII and pea
soluble starch synthase SSSII have also been reported
(Edwards et al, 1995; Marshall et al, 1996; Dry et al,
1992). However, corresponding full length cDNA sequences for
wheat have hitherto not been available, although a partial
cDNA sequence (Accession No. U48227) has been released to
the GenBank database.

Approach (b) referred to above has been
demonstrated for the gene for granule-bound starch synthase.
Null alleles on chromosomes 7A, 7D and 4A were identified by
the analysis of GBSS protein bands by electrophoresis, and
combined by plant breeding to produce a wheat line
containing no GBSS, and no amylose (Nakamura et al, 1995).
Subsequently, PCR-based DNA markers have been identified,
which also identify null alleles for the GBSS loci on each
of the three wheat genomes.

Despite the availability of a
considerable amount of information in the prior art, major
problems remain. Firstly, the presence of three separate
sets of chromosomes in wheat makes genetic analysis in this
species extraordinarily complex. This is further
complicated by the fact that a number of enzymes are
involved in starch synthesis, and each of these enzymes is
itself present in a number of forms, and in a number of
locations within the plant cell. Little, if any,
information has been available as to which specific form of
each enzyme is expressed in endosperm. For wheat, a limited
amount of nucleic acid sequence information is available,
but this is only cDNA sequence; no genomic sequence, and
consequently no information regarding promoters and other
control sequences, is available. Without being able to
demonstrate that the endosperm-specific gene within a family
has been isolated, such sequence information is of limited
practical usefulness.



SUMMARY OF THE INVENTION

In this application we report the isolation and
identification of novel genes from T. tauschii, the D-genome
donor of wheat, that encode SBE I, SBE II, a 75 kDa SSS I,
and an isoamylase-type debranching enzyme (DBE). Because of
the very close relationship between T. tauschii and wheat,
as discussed above, results obtained with T. tauschii can be
directly applied to wheat with little if any modification.
Such modification as may be required represents routine
trial and error experimentation. Sequences from these genes
can be used as probes to identify null or altered alleles in
wheat, which can then be used in plant breeding programmes
to provide modifications of starch characteristics. The
novel sequences of the invention can be used in genetic
engineering strategies, either to introduce a desired gene
into a host plant or to provide antisense sequences for
suppression of one or more specific genes in a host plant,
in order to modify the characteristics of starch produced by
the plant.

By using T. tauschii, we have been able to examine
a single genome, rather than three as in wheat, and to
identify and isolate the forms of the starch synthesis genes
which are expressed in endosperm. By addressing genomic
sequences we have been able to isolate tissue-specific
promoters for the relevant genes, which provides a mechanism
for simultaneous manipulation of a number of genes in the
endosperm. Because T. tauschii is so closely related to
wheat, results obtained with this model system are directly
applicable to wheat, and we have confirmed this
experimentally. The genomic sequences which we have
determined can also be used as probes for the identification
and isolation of corresponding sequences, including promoter
sequences, from other cereal plant species.

In its most general aspect, the invention provides
a nucleic acid sequence encoding an enzyme of the starch
biosynthetic pathway in a cereal plant, said enzyme being
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, soluble starch
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

Preferably the nucleic acid sequence is a DNA
sequence, and may be genomic DNA or cDNA. Preferably the
sequence is one which is functional in wheat. More
preferably the sequence is derived from a Triticum species,
most preferably Triticum tauschii.

Where the sequence encodes soluble starch
synthase, preferably the sequence encodes the 75 kDa soluble
starch synthase of wheat.

Biologically-active untranslated control sequences
of genomic DNA are also within the scope of the invention.
Thus the invention also provides the promoter of an enzyme
as defined above.

In a preferred embodiment of this aspect of the
invention, there is provided a nucleic acid construct
comprising a nucleic acid sequence of the invention, a
biologically-active fragment thereof, or a fragment thereof
encoding a biologically-active fragment of an enzyme as
defined above, operably linked to one or more nucleic acid
sequences facilitating expression of said enzyme in a plant,
preferably a cereal plant. The construct may be a plasmid
or a vector, preferably one suitable for use in the
transformation of a plant. A particularly suitable vector
is a bacterium of the genus Agrobacterium, preferably
Agrobacterium tumefaciens. Methods of transforming cereal
plants using Agrobacterium tumefaciens are known; see for
example Australian Patent No. 667939 by Japan Tobacco Inc.,






WO 99/14314 



PCT/AU98/00743 



International Patent Application Number PCT/US97/10621 by
Monsanto Company and Tingay et al (1997).

In a second aspect, the invention provides a
nucleic acid construct for targeting of a desired gene to
endosperm of a cereal plant, and/or for modulating the time
of expression of a desired gene in endosperm of a cereal
plant, comprising one or more promoter sequences selected
from SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a desired protein, and optionally also operatively
linked to one or more additional targeting sequences and/or
one or more 3' untranslated sequences.

The nucleic acid encoding the desired protein may
be in either the sense orientation or in the antisense
orientation. Preferably the desired protein is an enzyme of
the starch biosynthetic pathway. For example, the antisense
sequences of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, or grain softness protein I, may
be used. Preferred sequences for use in sense orientation
include those of bacterial isoamylase, bacterial glycogen
synthase, or wheat high molecular weight glutenin Bx17. It
is contemplated that any desired protein which is encoded by
a gene which is capable of being expressed in the endosperm
of a cereal plant is suitable for use in the invention.

In a third aspect, the invention provides a method
of modifying the characteristics of starch produced by a
plant, comprising the step of:

(a) introducing a gene encoding a desired enzyme
of the starch biosynthetic pathway into a host plant, and/or

(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,

wherein said enzymes are as defined above.

Where both steps (a) and (b) are used, the enzymes
in the two steps are different.

Preferably the plant is a cereal plant, more
preferably wheat or barley.

As is well known in the art, anti-sense sequences
can be used to suppress expression of the protein to which
the anti-sense sequence is complementary. It will be
evident to the person skilled in the art that different
combinations of sense and anti-sense sequences may be chosen
so as to effect a variety of different modifications of the
characteristics of the starch produced by the plant.
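
The notion of complementarity relied on here can be sketched
in a few lines of Python. The sketch below simply computes the
reverse complement of a DNA string, which is the sequence an
antisense-oriented insert presents relative to the sense strand.
The function name and the six-base example are hypothetical
illustrations and do not correspond to any sequence of this
specification; ambiguity codes and RNA bases are deliberately
not handled.

```python
# Translation table for the four standard DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense_seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Raises ValueError if the input contains anything other than A, C, G or T.
    """
    if set(sense_seq.upper()) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return sense_seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    example = "ATGGCC"  # arbitrary illustrative sequence
    print(antisense(example))             # GGCCAT
    print(antisense(antisense(example)))  # ATGGCC
```

Applying the function twice returns the original string, which
reflects the fact that sense and antisense strands are mutually
complementary.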

In a fourth aspect, the invention provides a
method of targeting expression of a desired gene to the
endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to the
invention.

According to a fifth aspect, the invention
provides a method of modulating the time of expression of a
desired gene in endosperm of a cereal plant, comprising the
step of transforming the plant with a construct according to
the second aspect of the invention.

Where expression at an early stage following
anthesis is desired, the construct preferably comprises the
SBE II, SSS I or DBE promoters. Where expression at a later
stage following anthesis is desired, the construct
preferably comprises the SBE I promoter.

While the invention is described in detail in
relation to wheat, it will be clearly understood that it is
also applicable to other cereal plants of the family
Gramineae, such as maize, barley and rice.

Methods for transformation of monocotyledonous
plants such as wheat, maize, barley and rice and for
regeneration of plants from protoplasts or immature plant
embryos are well known in the art. See for example Lazzeri
et al, 1991; Jahne et al, 1991 and Wan and Lemaux, 1994 for
barley; Wirtzens et al, 1997; Tingay et al, 1997; Canadian
Patent Application No. 2092588 by Nehra; Australian Patent
Application No. 61781/94 by National Research Council of
Canada; Australian Patent No. 667939 by Japan Tobacco Co;
and International Patent Application Number PCT/US97/10621
by Monsanto Company.

The sequences of ADP glucose pyrophosphorylase
from barley (Australian Patent Application No. 65392/94),
starch debranching enzyme and its promoter from rice
(Japanese Patent Publication No. Kokai 6261787 and Japanese
Patent Publication No. Kokai 5317057), and starch
debranching enzyme from spinach and potato (Australian
Patent Application No. 44333/96) are all known.



Detailed Description of the Drawings 
The invention will be described in detail by
reference only to the following non-limiting examples and to
the figures.

Figure 1 shows the hybridisation of genomic clones
isolated from T. tauschii.

DNA was extracted from the different clones,
digested with BamHI and hybridised with the 5' end of the
maize SBE I cDNA. Lanes 1, 2, 3 and 4 correspond to DNA
from clones λE1, λE2, λE6 and λE7 respectively. Note that
clones λE1 and λE2 give identical patterns, the SBE I gene
in λE6 is a truncated form of that in λE1, and λE7 gives a
clearly different pattern.

Figure 2 shows the hybridisation of DNA from
T. tauschii.

DNA from T. tauschii was digested with BamHI and
the hybridisation pattern compared with DNA from λE1 and λE7
digested with the same enzyme. Fragment E1.1 (see Figure 3)
from λE1 was used as the probe; it contains some sequences
that are over 80% identical to sequences in E7.8.
Approximately 25 µg of T. tauschii DNA was electrophoresed
in lane 1, and 200 pg each of λE1 and λE7 in lanes 2 and 3,
respectively.

Figure 3 shows the restriction maps of clones λE1
and λE7. The fragments obtained with EcoRI and BamHI are
indicated. The fragments sequenced from λE1 are E1.1, E1.2,
a part of E1.7 and a part of E1.5.

Figure 4 shows the comparison of deduced amino
acid sequence of wSBE I-D4 cDNA with the deduced amino acid
sequence of rice SBE I (RSBE I; Nakamura et al, 1992), maize
SBE I (MSBE I; Baba et al, 1991), wSBE I-D2 type cDNA (D2
cDNA; Rahman et al, 1997), pea SBE II (PESBE II, homologous
to maize SBE I; Burton et al, 1995), and potato SBE I
(POSBE; Cangiano et al, 1993). The deduced amino acid
sequence of the wSBE I-D4 cDNA is denoted by "D4cDNA".
Residues present in at least three of the sequences are
identified in the consensus sequence in capitals.

Figure 5 shows the intron-exon structure of
wSBE I-D4 compared to the corresponding structures of rice
SBE I (Kawasaki et al, 1993) and wSBE I-D2 (Rahman et al,
1997). The intron-exon structure of wSBE I-D4 is deduced by
comparison with the SBE I cDNA reported by Repellin et al
(1997).

The dark rectangles correspond to exons and the
light rectangles correspond to introns. The bars above the
structures indicate the percentage identity in sequence
between the indicated exons and introns of the relevant
genes. Note that intron 2 shares no significant sequence
identity and is not indicated.

Figure 6 shows the nucleotide sequence of part of
wSBE I-D4, the amino acid sequence deduced from this
nucleotide sequence, and the N-terminal amino acid sequence
of the SBE I purified from the wheat endosperm (Morell et
al, 1997).

Figure 7 shows the hybridisation of SBE I genomic
clones with the following probes:

A. wSBE I-D45 (derived from the 5' end of the
gene and including sequence from fragments E1.1 and E1.7),
and

B. wSBE I-D43 (derived from the 3' end of the
gene and containing sequences from fragment E1.5).

For panel A, tracks 1-13 correspond to clones λE1, λE2, λE6,
λE7, λE9, λE14, λE22, λE27, molecular weight markers, λE29,
λE30, λE31 and λE52. For panel B, tracks 1-12 correspond to
clones λE1, λE2, λE6, λE7, λE9, λE14, λE22, λE27, λE29,
λE30, λE31 and λE52. Note that clones λE7 and λE22 do not
hybridise to either of the probes and are wSBE I-D2 type
genes. Also note that clone λE30 contains a sequence
unrelated to SBE I. The size of the molecular weight
markers in kb is indicated. Clones λE7 and λE22 do
hybridise with a probe from E1.1, which is highly conserved
between wSBE I-D2 and wSBE I-D4.

Figure 8 shows the alignment of cDNA clones to
obtain the sequence represented by wSBE I-D4 cDNA. BED4 and
BED5 were obtained from screening the cDNA library with
maize BEI (Baba et al, 1991). BED1, 2 and 3 were obtained
by RT-PCR using defined primers.

Figure 9a shows the expression of Soluble Starch
Synthase I (SSS), Starch Branching Enzyme I (BE I) and
Starch Branching Enzyme II (BE II) mRNAs during endosperm
development.

RNA was purified from leaves, florets prior to
anthesis, and endosperm of wheat cultivar Rosella grown in a
glasshouse, collected 5 to 8 days after anthesis, 10 to 15
days after anthesis and 18 to 22 days after anthesis, and
from the endosperm of wheat cultivar Rosella grown in the
field and collected 12, 15 and 18 days after anthesis
respectively. Equivalent amounts of RNA were
electrophoresed in each lane. The probes were from the
coding region of the SM2 SSS I cDNA (from nucleotide 1615 to
1919 of the SM2 cDNA sequence); wSBE I-D43C (see Table 1),
which corresponds to the untranslated 3' end of wSBE I-D4
cDNA (E1 (3')); and the 5' region of SBE9 (SBE9 (5')),
corresponding to the region between nucleotides 743 and 1004
of GenBank sequence Y11282. No hybridisation to RNA
extracted from leaves or pre-anthesis florets was detected.

Figure 9b shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the starch branching enzyme I gene. The probe, wSBE I-D43, is
defined in Table 1.

Figure 9c shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Wyuna" with
the starch branching enzyme II gene. The probe, wSBE II-D13,
is defined in Table 2.

Figure 9d shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the SSS I gene. The probe spanned the region from
nucleotides 2025 to 2497 of the SM2 cDNA sequence shown in
SEQ ID No: 11.

Figure 9e shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the DBE I gene. The probe, a DBE3 3' PCR fragment, extends
from nucleotide position 281 to 1072 of the cDNA sequence in
SEQ ID No: 16.

Figure 9f shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the wheat actin gene. The probe was a wheat actin DNA
sequence generated by PCR from wheat endosperm cDNA using
primers to conserved plant actin sequences.

Figure 9g shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
a probe containing wheat ribosomal RNA 26S and 18S fragments
(plasmid pta250.2 from Dr Bryan Clarke, CSIRO Plant
Industry).

Figure 9h shows the hybridisation of RNA from the
hexaploid wheat cultivar "Gabo" with the DBE I probe
described in Figure 9e. Lane 1, leaf RNA; lane 2,
pre-anthesis floret RNA; lane 3, RNA from endosperm harvested 12
days after anthesis.

Figure 10 shows the comparison of wSBE I-D4
(sr427.res ck: 6,362, 1 to 11,099) and rice SBE I genomic
sequence (d10838.em_pl ck: 3,071, 1 to 11,700) (Kawasaki et
al, 1993; Accession Number D10838) using the programs
Compare and DotPlot (Devereaux et al, 1984). The programs
used a window of 21 bases with a stringency of 14 to
register a dot.
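
The window and stringency criterion can be illustrated with a
minimal Python sketch. It is not the program used to prepare
Figure 10; it is a naive re-implementation that assumes a dot is
registered when at least `stringency` of the `window` aligned
bases match, and that reports each dot at the start of its
window (the actual programs may place the dot elsewhere in the
window). The function name is hypothetical, and the triple loop
is far too slow for sequences of 11 kb without further
optimisation.

```python
from typing import List, Tuple


def dot_plot(seq_a: str, seq_b: str,
             window: int = 21, stringency: int = 14) -> List[Tuple[int, int]]:
    """Return 0-based (i, j) window start positions that register a dot.

    A dot is registered when at least `stringency` of the `window`
    aligned positions starting at seq_a[i] and seq_b[j] are identical.
    """
    if window <= 0 or not 0 < stringency <= window:
        raise ValueError("require 0 < stringency <= window")
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Count identical bases inside the two aligned windows.
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= stringency:
                dots.append((i, j))
    return dots


if __name__ == "__main__":
    # Toy sequences with a small window so the result is easy to inspect.
    print(dot_plot("ACGTACGT", "ACGTACGT", window=4, stringency=4))
```

Lowering the stringency relative to the window admits more
mismatches per window, so more dots appear and diagonal runs of
related sequence become easier to see at the cost of added
background.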

Figure 11 shows the hybridisation of wheat DNA
from chromosome-engineered lines using the following probes:

A. wSBE I-D45 (from the 5' end of the gene),

B. wSBE I-D43 (from the 3' end of the gene),

and

C. wSBE I-D4R (repetitive sequence
approximately 600 bp 3' to the end of wSBE I-D4 sequence).

N7AT7B, no 7A chromosome, four copies of 7B
chromosome; N7BT7D, no 7B chromosome, four copies of 7D
chromosome; N7DT7A, no 7D chromosome, four copies of 7A
chromosome. The chromosomal origin of hybridising bands is
indicated.

Figure 12 shows the hybridisation of genomic
clones F1, F2, F3 and F4 with the entire SBE-9 sequence.
The DNA from the clones was purified and digested with
either BamHI or EcoRI, separated on agarose, blotted onto
nitrocellulose and hybridised with labelled SBE-9 (a SBE II
type cDNA). The pattern of hybridising bands is different
in the four isolates.

Figure 13a shows the N-terminal sequence of
purified SBE II from wheat endosperm as in Morell et al
(1997).

Figure 13b shows the deduced amino acid sequence
from part of wSBE II-D1 that encodes the N-terminal sequence
as described in Morell et al (1997).

Figure 14 shows the deduced exon-intron structure
for a part of wSBE II-D1. The scale is marked in bases.
The dark rectangles are exons.

Figure 15 shows the hybridisation of DNA from
chromosome engineered lines of wheat (cultivar Chinese
Spring) with a probe from nucleotides 550-850 from SBE-9.
The band of approximately 2.2 kb is missing in the line in
which chromosome 2D is absent.

T2BN2A: four copies of chromosome 2B, no copies
of chromosome 2A;

T2AN2B: four copies of chromosome 2A, no copies
of chromosome 2B;

T2AN2D: four copies of chromosome 2A, no copies
of chromosome 2D.
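
The reasoning used to assign a hybridising band to a
chromosome with such lines can be sketched as follows. The
Python below is an illustrative assumption rather than a method
of this specification: it parses line names of the form used in
the figure legends (for example T2AN2D or N7AT7B) to find the
missing (nullisomic) chromosome, and reports the chromosome or
chromosomes whose absence coincides with loss of the band. The
function names are hypothetical, and the band observations are
supplied by the user.

```python
import re
from typing import Dict, List


def nullisomic_chromosome(line_name: str) -> str:
    """Extract the absent chromosome, e.g. 'T2AN2D' -> '2D', 'N7AT7B' -> '7A'."""
    match = re.search(r"N(\d[ABD])", line_name)
    if match is None:
        raise ValueError(f"cannot parse line name: {line_name!r}")
    return match.group(1)


def locate_band(band_present: Dict[str, bool]) -> List[str]:
    """Return chromosomes whose absence coincides with loss of the band.

    `band_present` maps a nullisomic-tetrasomic line name to True if the
    band was observed in that line and False if it was missing.
    """
    return sorted(
        nullisomic_chromosome(name)
        for name, present in band_present.items()
        if not present
    )


if __name__ == "__main__":
    # Observations mirroring the Figure 15 legend: the band of about
    # 2.2 kb is missing only in the line lacking chromosome 2D.
    observed = {"T2BN2A": True, "T2AN2B": True, "T2AN2D": False}
    print(locate_band(observed))  # ['2D']
```

A band that is missing from exactly one line is attributed to
the chromosome absent from that line; an empty result would mean
that the band is not located on any of the chromosomes tested.
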
Figure 16 shows the N-terminal sequence of SSS I
protein isolated from starch granules (Rahman et al, 1995)
and deduced amino acid sequence of part of Sm2.

Figure 17 shows the hybridisation of genomic
clones sg1, 3, 4, 6 and 11 with the cDNA clone (sm2) for SSS
I. DNA was purified from the indicated genomic clones, digested
with BamHI or SacI and hybridised to sm2. Note that the
hybridisation patterns for sg1, 3 and 4 are clearly
different from each other.

Figure 18 shows a comparison of the intron/exon
structures of the wheat and rice soluble starch synthase
genomic sequences. The dark rectangles indicate exons and
the light rectangles represent introns.

Figure 19 shows the hybridisation of DNA from
chromosome engineered lines of wheat (cultivar Chinese
Spring) digested with PvuII, with the sm2 probe.

N7AT7B: no 7A chromosome, four copies of 7B
chromosome;

N7BT7D: no 7B chromosome, four copies of 7D
chromosome;

N7DT7A: no 7D chromosome, four copies of 7A
chromosome.

A band is missing in the N7BT7A line.

Figure 20a shows the DNA sequence of a portion of
the wheat debranching enzyme (WDBE-I) PCR product. The
PCR product was generated from wheat genomic DNA (cultivar
Rosella) using primers based on sequences conserved in
debranching enzymes from maize and rice.

Figure 20b shows a comparison of the nucleotide
sequence of wheat debranching enzyme I (WDBE-I) PCR fragment
(WHEAT.DNA) with the maize Sugary-1 sequence (SUGARY.DNA).

Figure 20c shows a comparison between the
intron/exon structures of wheat debranching enzyme gene and
the maize sugary-1 debranching enzyme gene.

Figure 21a shows the results of Southern blotting
of T. tauschii DNA with wheat DBE-I PCR product. DNA from
T. tauschii was digested with BamHI, electrophoresed,
blotted and hybridised to the wheat DBE-I PCR product
described in Figure 20a. A band of approximately 2 kb
hybridised.

Figure 21b shows Chinese Spring
nullisomic/tetrasomic lines probed with probes from the DBE gene. Panel
(I) shows hybridisation with a fragment spanning the region
from nucleotide 270 to 465 of the cDNA sequence shown in SEQ
ID No: 16 from the central region of the DBE gene. Panel
(II) shows hybridisation with a probe from the 3' region of
the gene, from nucleotide 281 to 1072 of the cDNA sequence
given in SEQ ID No: 16.

Figures 22a to 22e show diagrammatic
representations of the DNA vectors used for transient
expression analysis. In each of the sequences the N-terminal
methionine encoding ATG codon is shown in bold.

Figure 22a shows a DNA construct pwsssIpro1gfpNOT
containing a 1042 base pair region of the wheat soluble
starch synthase I promoter (wSSSIpro1, from -1042 to -1, SEQ
ID No: 18) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22b shows a DNA construct pwsssIpro2gfpNOT
containing a 3914 base pair region of the wheat soluble
starch synthase I promoter (wSSSIpro2, from -3914 to -1, SEQ
ID No: 18) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22c shows a DNA construct psbeIIpro1gfpNOT
containing a 1203 base pair region of the wheat starch
branching enzyme II promoter (sbeIIpro1, from 1 to 1203 of SEQ
ID No: 10) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22d shows a DNA construct psbeIIpro2gfpNOT
containing a 1353 base pair region of the wheat starch
branching enzyme II promoter and transit peptide coding
region (sbeIIpro2, regions 1-1203, 1204 to 1336 and 1664 to
1680 of SEQ ID No: 10) fused to the green fluorescent protein
(GFP) reporter gene.
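
The region sizes quoted for these constructs can be checked
with simple arithmetic. The short Python sketch below assumes
that the quoted coordinates are inclusive, which is an
assumption made here for illustration; under that assumption
the three regions listed for sbeIIpro2 add up to the stated
1353 base pairs, and the range -1042 to -1 spans 1042 positions.
The function and constant names are hypothetical.

```python
from typing import List, Tuple


def region_length(start: int, end: int) -> int:
    """Length of a coordinate range, counting both end points."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def total_length(regions: List[Tuple[int, int]]) -> int:
    """Combined length of several non-overlapping inclusive ranges."""
    return sum(region_length(start, end) for start, end in regions)


# Regions of SEQ ID No: 10 quoted for sbeIIpro2 in the Figure 22d legend.
SBEIIPRO2_REGIONS = [(1, 1203), (1204, 1336), (1664, 1680)]

if __name__ == "__main__":
    # 1203 + 133 + 17 = 1353
    print(total_length(SBEIIPRO2_REGIONS))
```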

Figure 22e shows a DNA construct pact_jsgfg_nos
containing the plasmid backbone of pSP72 (Promega), the rice
ActI actin promoter (McElroy et al, 1991), the GFP gene
(Sheen et al, 1995) and the Agrobacterium tumefaciens
nopaline synthase (nos) terminator (Bevan et al, 1983).

Figure 23 shows T-DNA constructs for stable
transformation of rice by Agrobacterium. The backbone for
each plasmid is p35SH-iC (Wang et al, 1997). The various
promoter-GFP-Nos regions inserted are shown in (a), (b), (c)
and (d) respectively, and are described in detail in Example
24. Each of these constructs was inserted into the NotI
site of p35SH-iC using the NotI flanking sites at each end
of the promoter-GFP-Nos regions. The constructs were named
(a) p35SH-iC-BEIIpro1_GFP_Nos, (b) p35SH-iC-BEIIpro2_GFP_Nos,
(c) p35SH-iC-SSIpro1_GFP_Nos and (d)
p35SH-iC-SSIpro2_GFP_Nos.

Figure 24 illustrates the design of 15 intron-
spanning BE II primer sets. Primers were based on the
wSBE II-D1 sequence (SEQ ID No: 10), and were designed such
that intron sequences in the wSBE II-D1 sequence (deduced
from Figure 13b and Nair et al, 1997; Accession No. Y11282)
were amplified by PCR.

Figure 25 shows the results of amplification using
the SBE II-Intron 5 primer set (primer set 6: sr913F and
WBE2E6R) on various diploid, tetraploid and hexaploid
wheats.

(i) T. boeoticum (A genome diploid)
(ii) T. tauschii (D genome diploid)
(iii) T. aestivum cv. Chinese Spring ditelosomic line
2AS (lacking chromosome arm 2AL)
(iv) Crete 10 (AABB tetraploid)
(v) T. aestivum cv. Rosella (hexaploid)

The horizontal axis indicates the size of the
product in base pairs; the vertical axis shows arbitrary
fluorescence units. The various arrows indicate the products
of different genomes: A, A genome; B, B genome; D, D genome;
U, unassigned additional product.

Figure 26 shows the results obtained by
amplification using the SBE II-Intron 10 primer set (primer
set 11: da5.seq and WBE2E11R) on the wheat lines:

(i) T. aestivum cv. Chinese Spring ditelosomic line
2AS.
(ii) T. aestivum Chinese Spring
nullisomic/tetrasomic line N2BT2A.
(iii) T. aestivum Chinese Spring
nullisomic/tetrasomic line N2DT2B.

The horizontal axis indicates the size of the
product in base pairs; the vertical axis shows arbitrary
fluorescence units. The various arrows indicate the products
of different genomes: A, A genome; B, B genome; D, D genome.

Figure 27 shows the results of transient
expression assays typical of each promoter and target
tissue. The photographs (40 x magnification) show
representative tissue from these assays, viewed under a
Leica microscope with blue light illumination. Photographs
were taken 48 to 72 hours after tissue bombardment. The
promoter constructs are listed as follows (with the panels
showing endosperm, embryo and leaf expression listed in
respective order): pact_jsgfp_nos (panels a, g and m);
pwsssIpro1gfpNOT (panels b, h and n); pwsssIpro2gfpNOT
(panels c, i and o); psbeIIpro1gfpNOT (panels d, j and p);
psbeIIpro2gfpNOT (panels e, k and q); pZLgfpNOT (panels f,
l and r).



Example 1    Identification of Gene Encoding SBE I

Construction of Genomic Library and Isolation of Clones

The genomic library used in this study was
constructed from Triticum tauschii, var. strangulata,
accession number CPI 100799. Of all the accessions of
T. tauschii surveyed, the genome of CPI 100799 is the most
closely related to the D genome of hexaploid wheat.



WO 99/14314                              PCT/AU98/00743



Triticum tauschii, var strangulata (CPI accession
number 110799) was kindly provided by Dr E Lagudah. Leaves
were isolated from plants grown in the glasshouse.

DNA was extracted from leaves of Triticum tauschii
using published methods (Lagudah et al, 1991), partially
digested with Sau3A, size fractionated and ligated to the
arms of lambda GEM 12 (Promega). The ligated products were
used to transfect the methylation-tolerant strain PMC 103
(Doherty et al. 1992). A total of 2 x 10 primary plaques
were obtained with an average insert size of about 15 kb.
Thus the library contains approximately 6 genomes' worth of
T. tauschii DNA. The library was amplified and stored at
4°C until required.

Positive plaques in the genomic library were
selected as those hybridising with the 5' end of a maize
starch branching enzyme I cDNA (Baba et al, 1991) using
moderately stringent conditions as described in Rahman et
al (1997).

Preparation of Total RNA from Wheat

Total RNA was isolated from leaves, pre-anthesis
pericarp and different developmental stages of wheat
endosperm of the cultivars Hartog and Rosella. This
material was collected from both the glasshouse and the
field. The method used for RNA isolation was essentially
the same as that described by Higgins et al (1976). RNA was
then quantified by UV absorption and by separation in
1.4% agarose-formaldehyde gels which were then visualized
under UV light after staining with ethidium bromide
(Sambrook et al, 1989).

DNA and RNA analysis

DNA was isolated and analysed using established
protocols (Sambrook et al, 1989). DNA was extracted from
wheat (cv. Chinese Spring) using published methods (Lagudah
et al, 1991). Southern analysis was performed essentially
as described by Jolly et al (1996). Briefly, 20 µg wheat
DNA was digested, electrophoresed and transferred to a nylon
membrane. Hybridisation was conducted at 42°C in 25% or
50% formamide, 2 x SSC, 6% Dextran Sulphate for 16 h and the
membrane was washed at 60°C in 2 x SSC for 3 x 1 h unless
otherwise indicated. Hybridisation was detected by
autoradiography using Fuji X-Omat film.

RNA analysis was performed as follows. 10 µg of
total RNA was separated in a 1.4% agarose-formaldehyde gel
and transferred to a nylon Hybond N+ membrane (Sambrook et
al, 1989), and hybridized with cDNA probe at 42°C in
Khandjian hybridizing buffer (Khandjian, 1989). The 3' part
of wheat SBE I cDNA (designated wSBE I-D43, see Table 1) was
labelled with the Rapid Multiprime DNA Probe Labelling Kit
(Amersham) and used as probe. After washing at 60°C with
2 x SSC, 0.1% SDS three times, each time for about 1 to
2 hours, the membrane was visualized by overnight exposure
at -80°C with X-ray film, Kodak MR.

Example 2    Frequency of Recovery of SBE I Type Clones
             from the Genomic Library

g 

An estimated 2 x 10 plaques from the amplified
library were screened using an EcoRI fragment that contained
1200 bp at the 5' end of maize SBE I (Baba et al, 1991) and
twelve independent isolates were recovered and purified.
This corresponds to the screening of somewhat fewer than the
2 x 10 primary plaques that exist in the original library
(each of which has an average insert size of 15 kb)
(Maniatis et al, 1982), because the amplification may lead
to the representation of some sequences more than others.
Assuming that the amplified library contains approximately
three genomes of T. tauschii, the frequency with which
SBE I-positive clones were recovered suggests the existence
of about 5 copies of SBE I type genes within the T. tauschii
genome.
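
The reasoning behind this copy-number estimate can be sketched as a simple ratio. The sketch below is illustrative only: it assumes that each gene copy is recovered about once per genome equivalent screened, and the function name is hypothetical. With the twelve isolates and three genome equivalents stated above, the plain ratio is of the same order as the "about 5 copies" given in the text; the text does not state how its figure was rounded or corrected.

```python
def estimated_copy_number(independent_isolates: int,
                          genome_equivalents: float) -> float:
    """Gene copies per genome implied by a library screen.

    Assumes (for illustration) that every copy is recovered about once
    per genome equivalent of library screened.
    """
    if genome_equivalents <= 0:
        raise ValueError("genome_equivalents must be positive")
    return independent_isolates / genome_equivalents


# Values taken from the text: twelve isolates, about three genome equivalents.
copies = estimated_copy_number(12, 3.0)
print(f"Implied SBE I type genes per genome: {copies:.1f}")
```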

Digestion of DNA from the twelve independent
isolates by the restriction endonuclease BamHI, followed by
hybridisation with a maize SBE I clone, suggested that the
genomic clones could be separated into two broad classes
(Figure 1). One class had 10 members and a representative
from this class is the clone λE1 (Figure 1, lane 1); λE6
(Figure 1, lane 3) is a member of this class, but is missing
the 5' end of the E1-SBE I gene because the SBE I gene is at
the extremity of the cloned DNA. Further hybridisation
studies at high stringency with the extreme 5' and
3' regions of the SBE I gene contained in λE1 suggested that
the other clones contained either identical or very closely
related genes.

The second family had two members, and of these
clone λE7 (Figure 1, lane 4) was arbitrarily selected for
further study. These two members did not hybridise to
probes from the extreme 5' and 3' regions of the SBE I gene
that were contained in λE1, indicating that they were a
distinct sub-class.

The DNA from T. tauschii and the lambda clones λE1
and λE7 was digested with BamHI and hybridised with
fragment E1.1, as shown in Figure 2. This fragment contains
sequences that are highly conserved (85% sequence identity
over 0.3 kb between λE1 and λE7), corresponding to exons 3,
4 and 5 of the rice gene. The bands in the genomic DNA at
0.8 kb and 1.0 kb correspond to identical sized fragments
from λE1 and λE7, as shown in Figure 2; these are
fragments E1.1 and E7.8 of the λE1 and λE7 genomic clones
respectively. Thus the arrangement of genes in the genomic
clones is unlikely to be an artefact of the cloning
procedure. There are also bands in the genomic DNA of
approximately 2.5 kb, 4.8 kb and 8 kb in size which are not
found from the digestion of λE1 or λE7; these could
represent genes such as the 5' sequences of wSBE I-D1 or
wSBE I-D3; see below.



Example 3    Tandem Arrangement of SBE I Type Genes in
             the T. tauschii Genome

Basic restriction endonuclease maps for λE1 and
λE7 are shown in Figure 3. The maps were constructed by
performing a series of hybridisations of EcoRI or BamHI
digested DNA from λE1 or λE7. The probes used were the
fragments generated from BamHI digestion of the relevant
clone. Confirmation of the maps was obtained by PCR
analysis, using primers both within the insert and also from
the arms of lambda itself. PCR was performed in 10 µl
volume using reagents supplied by Perkin-Elmer. The primers
were used at a concentration of 20 µM. The program used was
94°C, 2 min, 1 cycle, then 94°C, 30 sec; 55°C, 30 sec; 72°C,
1 min for 36 cycles and then 72°C, 5 min; 25°C, 1 min.
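
For readers who wish to check the thermal program, it can be written down as data and its programmed hold time totalled. This is a minimal sketch; the names `Step` and `programmed_seconds` are illustrative and are not part of any thermocycler software, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


# Program as stated in the text.
INITIAL = [Step(94, 120)]                              # 94 C, 2 min, 1 cycle
CYCLE = [Step(94, 30), Step(55, 30), Step(72, 60)]     # repeated 36 times
N_CYCLES = 36
FINAL = [Step(72, 300), Step(25, 60)]                  # final extension, then hold


def programmed_seconds(initial, cycle, n_cycles, final):
    """Total hold time of the program, excluding ramping between steps."""
    total = sum(s.seconds for s in initial)
    total += n_cycles * sum(s.seconds for s in cycle)
    total += sum(s.seconds for s in final)
    return total


total = programmed_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s = {total / 60:.0f} min of programmed hold time")
```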

Sequencing was performed on an ABI sequencer using
the manufacturer's recommended protocols for both dye primer
and dye terminator technologies. Deletions were carried out
using the Erase-a-base kit from Promega.

Sequence analysis was carried out using the GCG
version 7 package of computer programs (Devereaux et al,
1984).

The PCR products were also used as hybridisation
probes. The positioning of the genes was derived from
sequencing the ends of the BamHI subclones and also from
sequencing PCR products generated from primers based on the
insert and the lambda arms. The results indicate that there
is only a single copy of a SBE I type gene within λE1.
However, it is clear that λE7 resulted from the cloning of a
DNA fragment from within a tandem array of the SBE I type
genes. Of the three genes in the clone (which are named
wSBE I-D1, wSBE I-D2 and wSBE I-D3), only the central one
(wSBE I-D2) is complete.

Example 4    Construction and Screening of cDNA Library

A wheat cDNA library was constructed from the
cultivar Rosella using pooled RNA from endosperm at 8, 12,
18 and 20 days after anthesis.

The cDNA library was prepared from poly A+ RNA
that was extracted from developing wheat grains (cv.
Rosella, a hexaploid soft wheat cultivar) at 8, 12, 15, 18,
21 and 30 days after anthesis. The RNA was pooled and used
to synthesise cDNA that was propagated in lambda ZapII
(Stratagene).

The library was screened with a genomic fragment
encompassing exons 3, 4 and 5 (fragment E7.8 in
Figure 3). A number of clones were isolated. Of these an
apparently full-length clone appeared to encode an unusual
type of cDNA for SBE I. This cDNA has been termed SBE I-D2
type cDNA. The putative protein product is compared with
the maize SBE I and rice SBE I type deduced amino acid
sequences in Figure 4. The main difference is that this
putative protein product is shorter at the C-terminal end,
with an estimated molecular size of approximately 74 kDa
compared with 85 kDa for rice SBE I (Kawasaki et al, 1993).
Note that amino acids corresponding to exon 9 of rice are
missing in SBE I-D2 type cDNA, but those corresponding to
exon 10 are present. There are no amino acid residues
corresponding to exons 11-14 of rice; furthermore, the
sequence corresponding to the last 57 amino acids of
SBE I-D2 type has no significant homology to the sequence of
the rice gene.

We expressed SBE I-D2 type cDNA in E. coli in
order to examine its function. The cDNA was expressed as a
fusion protein with 22 N-terminal residues of β-galacto-
sidase and two threonine residues followed by the SBE I-D2
cDNA sequence either in or out of frame. Although an
expected product of about 75 kDa in size was produced from
only the in-frame fusion, we could not detect any enzyme
activity from crude extracts of E. coli protein.
Furthermore the in-frame construct could not complement an
E. coli strain with a defined deletion in glycogen
branching, although other putative branching enzyme cDNAs
have been shown to be functional by this assay (data not
shown). It is therefore unclear whether the wSBE I-D2 gene
in λE7 codes for an active enzyme in vivo.

Example 5    Gene Structure in λE7

i. Sequence of wSBE I-D2

We sequenced 9.2 kb of DNA that contained
wSBE I-D2. This corresponds to fragments 7.31, 7.8 and
7.18. Fragment 7.31 was sequenced in its entirety (4.1 kb),
but the sequence of about 30 bases about 2 kb upstream of
the start of the gene could not be obtained because it was
composed entirely of Gs. Elevation of the temperature of
sequencing did not overcome this problem. Fragments 7.8
(1 kb) and 7.18 (4 kb) were completely sequenced, and
corresponded to 2 kb downstream of the last exon detected
for this gene. It was clear that we had isolated a gene
which was closely related (approximately 95% sequence
identity) to the SBE I-D2 type cDNA referred to above,
except that the last 200 bp at the 3' end of the cDNA are
not present. The wSBE I-D2 gene includes sequences
corresponding to rice exon 11 which are not in the cDNA
clone. In addition it does not have exons 9, 12, 13 or 14;
these are also absent from the SBE I-D2 type cDNA. The
first two exons show lower identity to the corresponding
exons from rice (approximately 60%) (Kawasaki et al, 1993)
than do the other exons (about 80%). A diagrammatic exon-
intron structure of the wSBE I-D2 gene is indicated in
Figure 5. The restriction map was confirmed by sequencing
the PCR products that spanned fragments 7.18 and 7.8, and 7.8
and E7.31 (see Figure 3), respectively.

ii. Sequence of wSBE I-D3

This gene was not sequenced in detail, as the
genomic clone did not extend far enough to include the 5'
end of the sequence. The sequence is of a SBE I type. The
orientation of the gene is evident from sequencing of the
relevant BamHI fragments, and was confirmed by sequence
analysis of a PCR product generated using primers from the
right arm of lambda and a primer from the middle of the
gene. The sequence homology with wSBE I-D2 is about 80% over
the regions examined. The 2 kb sequenced corresponded to
exons 5 and 6 of the rice gene; these sequences were
obtained by sequencing the ends of fragments 7.5, 7.4 and
7.14 respectively, although the sequences from the left end
of fragment 7.14 did not show any homology to the rice
sequences. The gene does not appear to share the 3' end of
SBE I-D2 type cDNA, as a probe from 500 bp at the 3' end of
the cDNA (including sequences corresponding to exons 8 and
10 from rice) did not hybridise to fragment 7.14, although
it hybridised to fragment 7.18.


iii. Sequence of wSBE I-D1

This gene was also not sequenced in detail, as it
was clear that the genomic clone did not extend far enough
to include the 5' sequences. Limited sequencing suggests
that it is also a SBE I type gene. The orientation relative
to the left arm of lambda was confirmed by sequencing a PCR
product that used a primer from the left arm of lambda and
one from the middle of the gene (as above). Its sequence
homology with wSBE I-D2, D3 and D4 (see below) is about 75%
in the region sequenced corresponding to a part of exon 4 of
the rice gene.

Starch branching enzymes are members of the α-
amylase protein family, and in a recent survey Svensson
(1994) identified eight residues in this family that are
invariant, seven in the catalytic site and a glycine in a
short turn. Of the seven catalytic residues, four are
changed in SBE I-D2 type. However, additional variation in
the 'conserved' residues may come to light when more plant
cDNAs for branching enzyme I are available for analysis. In
addition, although exons 9, 11, 12, 13 and 14 from rice are
not present in the SBE I-D2 type cDNA, comparison of the
maize and rice SBE I sequences indicates that the 3' region
(from amino acid residue 730 of maize) is much more variable
than the 5' and central regions. The active sites of rice
and maize SBE I sequences, as indicated by Svensson (1994),
are encoded by sequences that are in the central portion of
the gene. When SBE II sequences from Arabidopsis were
compared by Fisher et al (1996) they also found variation at
the 3' and 5' ends. SBE I-D2 type cDNA may encode a novel
type of branching enzyme whose activity is not adequately
detected in the current assays for detecting branching
enzyme activity; alternatively the cDNA may correspond to an
endosperm mRNA that does not produce a functional protein.

Example 6    Cloning of the cDNA corresponding to the
             wSBE I-D4 gene

The first strand cDNAs were synthesized from 1 µg
of total RNA, derived from endosperm 12 days after
pollination, as described by Sambrook et al (1989), and then
used as templates to amplify two specific cDNA regions of
wheat SBE I by PCR.

Two pairs of primers were used to obtain the cDNA
clones BED1 and BED3 (Table 1). Primers used for cloning of
BED3 were the degenerate primer NTS5'

5' GGC NAC NGC NGA G/AGA C/TGG 3' (SEQ ID NO. 1),

based on the N-terminal sequence of the purified
wheat endosperm SBE I protein, in which the 5' end of the
primer is at position 168 of wSBE I-D4 cDNA, as shown in
Table 1, and the primer NTS3'

5' TAC ATT TCC TTG TCC ATCA 3' (SEQ ID NO. 2)

in which the 5' end is at position 1590 of
wSBE I-D4 cDNA (see Table 1), designed to anneal to the
conserved regions of the nucleotide sequences of BED5 and
the maize and rice SBE I cDNAs. For clone BED1, the
primers used were BEC5'

5' ATC ACG AGA GCT TGC TCA 3' (SEQ ID NO. 3)

in which the 5' end is at position 1 of wSBE I-D4
cDNA (see Table 1); the sequence was based on the wSBE I-D4
gene, and BEC3'

5' CGG TAC ACA GTT GCG TCA TTT TC 3' (SEQ ID NO. 4)

in which the 5' end is at position 334 of
wSBE I-D4 cDNA (see Table 1), and the sequence was based on
BED3.
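
The degeneracy of primer NTS5' can be made explicit by rewriting its slash notation in IUPAC ambiguity codes (G/A as R, C/T as Y). The following is a minimal sketch; the helper names `degeneracy` and `expand` are illustrative, and the IUPAC spelling of the primer is our transcription of the sequence given above.

```python
from itertools import product

# Subset of the IUPAC nucleotide ambiguity codes needed for this primer.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine, written G/A in the text
    "Y": "CT",    # pyrimidine, written C/T in the text
    "N": "ACGT",  # any base
}

# NTS5' (SEQ ID NO. 1): GGC NAC NGC NGA G/AGA C/TGG
NTS5 = "GGCNACNGCNGARGAYGG"


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str):
    """Yield every concrete sequence covered by the degenerate primer."""
    for combo in product(*(IUPAC[base] for base in primer)):
        yield "".join(combo)


print(f"length {len(NTS5)} nt, degeneracy {degeneracy(NTS5)}")
```

Three N positions and two two-fold positions give 4 x 4 x 4 x 2 x 2 distinct oligonucleotides in the primer pool.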


Example 7    Identification of the gene from the Triticum
             tauschii SBE I family which is expressed in
             the endosperm

We have isolated two classes of SBE I genomic
clones from T. tauschii. One class contained two genomic
clone isolates, and this class has been characterised in
some detail (Rahman et al, 1997). The complete gene
contained within this class of clones was termed wSBE I-D2;
there were additional genes at either end of the clone, and
these were designated wSBE I-D1 and wSBE I-D3. The other
class contained nine genomic clone isolates. Of these λE1
was arbitrarily taken as a representative clone, and its
restriction map is shown in Figure 3; the SBE I gene
contained in this clone was called wSBE I-D4.

Fragments E1.1 (0.8 kb) and E1.2 (2.1 kb) and
fragments E1.7 (4.8 kb) and E1.5 (3 kb) respectively were
completely sequenced. Fragment E1.7 was found to encode the
N-terminus of the SBE I which is found in the endosperm, as
described in Morell et al (1997). This is shown in
Figure 6. Using antibodies raised against the N-terminal
sequence, Morell et al (1997) found that the D genome
isoform was the most highly expressed in the cultivars
Rosella and Chinese Spring. We have thus isolated from
T. tauschii a gene, wSBE I-D4, whose homologue in the
hexaploid wheat genome encodes the major isoform for SBE I
that is found in the wheat endosperm.
Table 1

Location of structural features and probes within wSBE I-D4
sequence*

A. Location of exons by comparison with the cDNA sequence of
   Repellin et al. (1997). Accession number Y12320.

   Exon number    Start posn     End posn
        1            4890          4987
        2         [illegible]      5149
        3            5524          5731
        4            5819          5888
        5            6149          6318
        6            6519          7424
        7            7744          7860
        8            8015          8077
        9            8562          8670
       10            9137          9237
       11            9421          9488
       12            9580          9661
       13            9781          9897
       14            9990         10480

B. Other features.

   Name of feature                       wSBE I-D4       D4 cDNA
                                         sequence        sequence
   Putative initiation of translation    4900            11
   Mature N-terminal sequence of SBE I   5550            124
   End of translated SBE I sequence      10225           2431
   End of D4 cDNA sequence               10461           2687
   wSBE I-D45                            4870, 5860      1, 354
   wSBE I-D43                            10116, 10435    2338, 2657
   E1.1                                  5680, 6400      380, 630
   BED 1                                                 1, 354
   BED 2                                                 169, 418
   BED 3                                                 151, 1601
   BED 4                                                 867, 2372
   BED 5                                                 867, 2687
   Endosperm box like motif TGAAAAGT     4480, 590
   CAAAT motif                           4863
   TATAAA motif                          4833
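
The exon coordinates in part A of Table 1 can be turned into exon and intron lengths with a few lines of code. This is a sketch only: the start of exon 2 is illegible in the source and is left as `None`, the coordinates for exon 2 (end) and exon 7 are our best reading of a damaged scan, and the helper names are illustrative.

```python
# (start, end) positions in the wSBE I-D4 genomic sequence, from Table 1.
# None marks a value that could not be read from the source.
EXONS = {
    1: (4890, 4987), 2: (None, 5149), 3: (5524, 5731), 4: (5819, 5888),
    5: (6149, 6318), 6: (6519, 7424), 7: (7744, 7860), 8: (8015, 8077),
    9: (8562, 8670), 10: (9137, 9237), 11: (9421, 9488), 12: (9580, 9661),
    13: (9781, 9897), 14: (9990, 10480),
}
GENOMIC_ATG = 4900  # putative initiation of translation (Table 1, part B)


def exon_length(start: int, end: int) -> int:
    """Length of an exon given inclusive coordinates."""
    return end - start + 1


def intron_lengths(exons):
    """Map intron number n to its length, where both flanking ends are known."""
    lengths = {}
    for n in sorted(exons):
        if n + 1 not in exons:
            continue
        prev_end, next_start = exons[n][1], exons[n + 1][0]
        if prev_end is not None and next_start is not None:
            lengths[n] = next_start - prev_end - 1
    return lengths


introns = intron_lengths(EXONS)
# Offset of the ATG within exon 1, counting the first base of the exon as 1.
atg_in_cdna = GENOMIC_ATG - EXONS[1][0] + 1
print(exon_length(*EXONS[1]), introns[2], atg_in_cdna)
```

The computed offset of the initiation codon within exon 1 agrees with the cDNA position listed for it in part B of the table.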
All nine genomic clones of the λE1 type isolated
from T. tauschii appear to contain the wSBE I-D4 gene, or
very similar genes, on the basis of PCR amplification and
hybridisation experiments. However, the restriction
patterns obtained for the clones differ with BamHI and
EcoRI, among other enzymes, indicating that either the
clones represent near-identical but distinct genes or they
represent the same gene isolated in distinct products of the
Sau3A digest used to generate the library.


Example 8    Investigation of other SBE I genomic clones
             isolated

All ten members of the λE1-like class of SBE I
genomic clones were investigated by hybridisation with
probes derived from fragment E1.7 (sequence wSBE I-D45,
encoding the translation start signal and the first
100 amino acids from the N-terminal end and intron
sequences; see Table 1) and from fragment E1.5 (sequence
wSBE I-D43, corresponding largely to the 3' untranslated
sequence and containing intron sequences; see Table 1). The
results obtained were consistent with one type of gene being
isolated in different fragments in the different clones, as
shown in Figure 7. PCR products were obtained from the
clones λE1, 2, 9, 14, 27, 31 and 52 using primers that
amplify near the 5' end of the gene (positions 5590-6162 of
wSBE I-D4); these products hybridised to wSBE I-D45.
Sequencing showed no differences in sequence of a 200 bp
product.

Analysis of the promoter for wSBE I-D4 allows us
to investigate the presence of motifs previously described
for promoters that regulate gene expression in the
endosperm. Forde et al (1985) compared prolamin promoters,
and suggested that the presence of a motif approximately
300 bp upstream of the transcription start point, called
the endosperm box, was responsible for endosperm-specific
expression. The endosperm box was subsequently considered
to consist of two different motifs: the endosperm motif (EM)
(canonical sequence TGTAAAG) and the GCN4 motif (canonical
sequence G/ATGAG/CTCAT). The GCN4 box is considered to
regulate expression according to nitrogen availability
(Muller and Knudsen, 1993). The wSBE I-D4 promoter contains
a number of imperfect EM-like motifs at approximately -100,
-300 and -400 as well as further upstream. However, no GCN4
motifs could be found, which lends support to the idea that
this motif regulates response to nitrogen, as starch
biosynthesis is not as directly dependent on the nitrogen
status of the plant as storage protein synthesis. Comparison
of the promoters for wSBE I-D4 and D2 (Rahman et al, 1997)
indicates that although there are no extensive sequence
homologies there is a region of about 100 bp immediately
before the first encoded methionine where the homology is
61% between the two promoters. In particular there is an
almost perfect match in the sequence over twenty base pairs
CTCGTTGCTTCC / TACTCCACT (positions 4723-4742 of the wSBE I
sequence), but the significance of this is hard to gauge, as
it does not occur in the rice promoter for SBE I. The
availability of more promoters for starch biosynthetic
enzymes may allow firmer conclusions to be drawn. There are
putative CAAT and TATA motifs at positions 4870 and 4830
respectively of wSBE I-D4 sequence. The putative start of
translation of the mRNA is at position 4900 of wSBE I-D4.
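
A promoter scan of the kind described here can be sketched in a few lines. The example below is illustrative only: the test sequence `toy` is invented for demonstration and is not the wSBE I-D4 promoter, the function names are hypothetical, and a one-mismatch tolerance is our assumption for what counts as an "imperfect" EM-like motif. The canonical motifs are those quoted in the text, with G/A and G/C written as character classes.

```python
import re

# Canonical motifs quoted in the text.
EM_MOTIF = "TGTAAAG"
GCN4_PATTERN = "[GA]TGA[GC]TCAT"


def find_exact(seq: str, pattern: str):
    """0-based start positions of all (possibly overlapping) matches."""
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


def find_with_mismatches(seq: str, motif: str, max_mismatches: int = 1):
    """Windows differing from a fixed motif at no more than max_mismatches sites."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Invented demonstration sequence, NOT taken from the specification.
toy = "CCTGTAAAGTTTGAAAAGTCCATGAGTCATGG"

print("EM exact:", find_exact(toy, EM_MOTIF))
print("GCN4 exact:", find_exact(toy, GCN4_PATTERN))
print("EM-like:", find_with_mismatches(toy, EM_MOTIF))
```

Applied to a real promoter, the exact search corresponds to looking for canonical motifs, while the mismatch-tolerant search reports imperfect EM-like motifs such as the TGAAAAG core of the endosperm box like motif listed in Table 1.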

Figure 5 shows the structure of the wSBE I-D4
gene, compared with the genes from rice and wheat (Kawasaki
et al, 1993; Rahman et al, 1997). The rice SBE I has 14
exons compared with 13 for wSBE I-D4 and 10 for wSBE I-D2.
There is good conservation of exon-intron structure between
the three genes, except at the extreme 5' end. In particular
the sizes of intron 1 and intron 2 are very different
between rice SBE I and wSBE I-D4.



Example 9    Isolation of cDNA for SBE I

Using the maize starch branching enzyme I cDNA as
a probe (Baba et al, 1991), 10 positive plaques were
recovered by screening approximately 10 plaques from a
wheat endosperm cDNA library prepared from the cultivar

Five cDNA clones were sequenced and their
sequences were assembled into one contiguous sequence using
a GCG program (Devereaux et al, 1984). The arrangement of
these sequences is illustrated in Figure 8, the nucleotide
sequence is shown in SEQ ID No: 5, and the deduced amino acid
sequence is shown in SEQ ID No: 6. The intact cDNA sequence,
wSBE I-D4 cDNA, is 2687 bp and contains one large open
reading frame (ORF), which starts at nucleotides 11 to 13
and ends at nucleotides 2432 to 2434. It encodes a
polypeptide of 807 amino acids with a molecular weight of
87 kDa. Comparison of the amino acid sequence encoded by
wSBE I-D4 cDNA with that encoded by maize and rice SBE I
cDNAs showed that there is 75-80% identity between any
two of these sequences at the nucleotide level and almost 90%
at the amino acid level. Alignment of these three
polypeptide sequences, as shown in Figure 4, along with the
deduced sequences for pea, potato and wSBE I-D2 type cDNA,
indicated that the sequences in the central region are
highly conserved, and sequences at the 5' end (about
80 amino acids) and the 3' end (about 60 amino acids) are
variable.
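
The stated ORF coordinates and polypeptide length can be cross-checked arithmetically. The sketch below assumes that nucleotides 11 to 13 are the initiation codon and that nucleotides 2432 to 2434 are the stop codon, which is our reading of the text; the function name is illustrative.

```python
CDNA_LENGTH = 2687           # length of wSBE I-D4 cDNA, from the text
START_CODON = (11, 13)       # assumed to be the ATG
STOP_CODON = (2432, 2434)    # assumed to be the termination codon


def coding_codons(start_codon, stop_codon) -> int:
    """Number of amino-acid-encoding codons, excluding the stop codon."""
    first_coding_nt = start_codon[0]
    last_coding_nt = stop_codon[0] - 1
    n_nt = last_coding_nt - first_coding_nt + 1
    if n_nt % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return n_nt // 3


residues = coding_codons(START_CODON, STOP_CODON)
five_prime_utr = START_CODON[0] - 1
print(f"{residues} residues; 5' untranslated region of {five_prime_utr} nt")
```

Under these assumptions the coding region is a whole number of codons and its length matches the polypeptide size given in the text.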

Svensson et al (1994) indicated that there were
several invariant residues in sequences of the α-amylase
super-family of proteins to which SBE I belongs. In the
sequence of maize SBE I these are in motifs commencing at
amino acid residue positions 341, 415, 472, 537
respectively; these are also encoded in the wSBE I-D4
sequence (SEQ ID No: 9), further supporting the view that
this gene encodes a functional enzyme. This is in contrast
to the results with the wSBE I-D2 gene, where three of the
conserved motifs appear not to be encoded (Rahman et al,
1997).

There is about 90% sequence identity in the
deduced amino acid sequence between wSBE I-D4 cDNA and rice
SBE I cDNA in the central portion of the molecule (between
residues 160 and 740 for the deduced amino acid product from
wSBE I-D4 cDNA). The sequence identity of the deduced amino

1997 and this specification), and so we may have accounted
for most of the genes in T. tauschii and, by extension, the
genes from the D genome of wheat. In wheat, at least two
hybridising bands could be assigned to each of
chromosomes 7A, 7B and 7D.

Example 10   Tissue specificity and expression during
             endosperm development

The 300 bp of 3' untranslated sequence of
wSBE I-D4 cDNA does not show any homology with either the
wSBE I-D2 type cDNA that we have described earlier (Rahman
et al, 1997) or with BE-I from rice, as shown in Figure 5.
We have called this sequence wSBE I-D43C (see SEQ ID No: 9).
It seemed likely that wSBE I-D43C would be a specific probe
for this class of SBE I, and thus it was used to investigate
the tissue specificity. Hybridization of RNA from endosperm
of hexaploid T. tauschii cultures with SBE I, SBE II, SSS I,
DBE I, wheat actin, and wheat ribosomal RNA was examined.
RNA was purified at various numbers of days after anthesis
from plants grown with a 16 h photoperiod at 13°C (night)
and 18°C (day). The age of the endosperms from which RNA
was extracted in days after anthesis is given above the
lanes in the blot. Equivalent amounts of RNA were
electrophoresed in each lane. The probes used are identified
in Tables 1 and 2.

The results are shown in Figures 9a to 9g. An RNA
species of about 2700 bases in size was found to hybridise.
This is very close to the size of the wSBE I-D4 cDNA
sequence. RNA hybridising to wSBE I-D43C is most abundant
at the mid-stage of endosperm development, as shown in
Figure 9a, and in field grown material is relatively
constant during the period 12-18 days, the time at which
there is rapid starch and storage protein accumulation
(Morell et al, 1995).

The sequence contained within the wSBE I-D4 gene
appears to be expressed only in the endosperm (Figure 9a,
Figure 9b). We could not detect any expression in the leaf.
This could be because another isoform is expressed in the
leaf, and/or because the amount of SBE I present in the leaf
is much less than what is required in the endosperm.
Isolation of SBE I clones from a leaf cDNA library would
enable this question to be resolved.

E xample 11 Intron-Exon Structure of SBE I 

By comparison of the cDNA sequence of SBE I (Repellin et al, 1997) with that of wSBE I-D4 we can deduce the intron-exon structure of the gene for the major isoform of SBE I that is found in the endosperm. The structure contains 14 exons, compared to 14 for rice (Kawasaki et al, 1993). These 14 exons are spread over 6 kb of sequence, a distance similar to that found in both rice SBE I and wSBE I-D2. A dotplot comparison of the wSBE I-D4 sequence and the rice SBE I sequence, depicted in Figure 10, shows good sequence identity over almost the entire gene starting from about position 5100 of wSBE I-D4; the identity is poor over the first 5 kb of sequence, corresponding largely to the promoter sequences. The sequence identity over introns (about 60%) is lower than over exons (about 85%).
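
The dotplot comparison described above can be illustrated with a minimal sketch. It assumes a simple exact word-match criterion; the function name, the word size and the toy sequences are illustrative only and are not the program or parameters used to prepare Figure 10.

```python
from collections import defaultdict


def dotplot_matches(seq_a, seq_b, word=8):
    """Return (i, j) pairs where the word starting at seq_a[i]
    is identical to the word starting at seq_b[j]."""
    # Index every word of seq_b by its start position.
    index = defaultdict(list)
    for j in range(len(seq_b) - word + 1):
        index[seq_b[j:j + word]].append(j)

    # Look up every word of seq_a in that index.
    matches = []
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], ()):
            matches.append((i, j))
    return matches


# Toy sequences (hypothetical, not taken from wSBE I-D4 or rice SBE I).
toy_a = "ACGTTGCA"
toy_b = "TTACGTTG"
print(dotplot_matches(toy_a, toy_b, word=4))
```

Each returned pair is one dot of the plot; a run of dots along a diagonal indicates a region of shared sequence, and a region without dots (such as the first 5 kb discussed above) indicates poor identity.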

Example 12 Repeated Sequences in SBE I

Sequencing of wSBE I-D4 revealed that there was a repeated sequence of at least 300 bp contained in a 2 kb fragment about 600 bp after the 3' end of the gene. We have called this sequence wSBE I-D4R (SEQ ID NO: 9). This repeated sequence is within fragment E1.5 (Figure 3 and Table 1) and is flanked by non-repetitive sequences from the genomic clone. We have previously shown that the restriction pattern obtained by digesting λE1 with the restriction enzyme BamHI is also obtained when T. tauschii DNA is digested. Thus wSBE I-D4R is unlikely to be a cloning artefact. A search of the GenBank Database revealed that wSBE I-D4R shared no significant homology with any sequence in the database. Hybridisation experiments with wSBE I-D4R showed that all of the other SBE I-D4 type genomic clones (except number 29) contained this repeated sequence (data not shown). The wSBE I-D4R sequence was not highly repeated and occurred in the wheat genome with a frequency similar to that of the wSBE I-D4 sequence.
When wSBE I-D4R was used as the probe on wheat DNA from the nulli-tetra lines, four bands were obtained; two of these bands could be assigned to chromosome 7A and the others to chromosomes 7B and 7D (Figure 11). One of the two BamHI fragments from wheat DNA which could be assigned to chromosome 7A was distinct from the single band from chromosome 7A detected using wSBE I-D43 as the probe; the other three bands coincided in the autoradiograph with bands obtained with wSBE I-D43, and are likely to represent the same fragment. However, one of these fragments was distinct from the BamHI fragment that hybridised to the wSBE I-D43 sequence. In wSBE I-D4 (see SEQ ID No: 9), the wSBE I-D43 sequence is only 300 bp upstream of wSBE I-D4R, and occurs in the same BamHI fragment. These results suggest that the wSBE I-D4R sequence can occur independently of wSBE I-D4 in the wheat genome.

Example 13 Isolation of Genomic Clones Encoding SBE II 

Screening of a cDNA library, prepared from the wheat endosperm as described in Example 4, with the maize BE I clone (Baba et al, 1991) at low stringency led to the isolation of two classes of positive plaques. One class was strongly hybridising, and led to the isolation of wheat SBE I-D2 type and SBE I-D4 type cDNA clones, as described in Example 5 and in Rahman et al (1997). The second class was weakly hybridising, and one member of this class was purified. This weakly hybridising clone was termed SBE-9, and on sequencing was found to contain a sequence that was distinct from that for SBE I. This sequence showed greatest homology to maize BE II sequences, and was considered to encode part of the wheat SBE II sequence.

c 

The screening of approximately 5 x 10 plaques 
from a genomic library constructed from T. tauschii (see Example 1) with the SBE-9 sequence led to the isolation of four plaques that were positive. These were designated wSBE II-D1 to wSBE II-D4 respectively, and were purified and analysed by restriction mapping. Although they all had different hybridization patterns with SBE-9, as shown in Figure 12, the results were consistent with the isolation of the same gene in different-sized fragments.



Example 14 Identification of the N-terminal Sequence of SBE II

Sequencing of the SBE II gene contained in clone 2, termed wSBE II-D1 (see SEQ ID No: 10), showed that it coded for the N-terminal sequence of the major isoform of SBE II expressed in the wheat endosperm, as identified by Morell et al (1997). This is shown in Figure 13.

Example 15 Intron-Exon Structure of the SBE II Gene 

In addition to encoding the N-terminal sequence of SBE II, as shown in Example 10, the cDNA sequence reported by Nair et al (1997) was also found to have 100% sequence identity with part of the sequence of wSBE II-D1. Thus the intron-exon structure can be deduced, and this is shown in Figure 14. The positions of exons and other major structural features of the SBE II gene are summarized in Table 2.

Example 16 Number of SBE II Genes in T. tauschii and Wheat

Hybridisation of the SBE II conserved region with T. tauschii DNA revealed the presence of three gene classes. However, in our screening we only recovered one class. Hybridisation to wheat DNA indicated that the locus for SBE II was on chromosome 2, with approximately 5 loci in wheat; most of these appear to be on chromosome 2D, as shown in Figure 15.
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Table 2 

Positions of structural features in wSBE II-D1

A. Positions of exons

Exon number   Genomic start   Genomic finish
1             1058            1336
2             1664            1761
3             2038            2279
4             2681            2779
5             2949            2997
6             3145            3204
7             (not legible)   (not legible)
8             3704            3825
9             4110            4188
10            4818            4939
11            5115            5234
12            6209            6338
13            6427            6549
14            6739            6867
15            7447            7550
16            8392            8536
17            9556            9703
18            9839            9943
19            10120           10193
20            10395           10550
21            10928           11002
22            11092           11475

B. Other structural features within the wSBE II-D1 DNA sequence

Feature                                  Position
Putative initiation of translation       1214
Mature N-terminal sequence of SBE II     1681
wSBE II-D13                              11116 to 11448
Endosperm box like motif TGAAAAGT        521
Endosperm box like motif TGAAAGT         565
Endosperm box like motif CGAAAAT         669
Endosperm box like motif TAAATGT         768
CAAAAT motif                             784
TCAATT motif                             1108
TATAAA motif                             799
AATTAA motif                             1110
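
The exon coordinates in part A of Table 2 can be turned into exon and intron lengths with a short calculation. The sketch below is illustrative only: it assumes the genomic start and finish positions are 1-based and inclusive, it omits exon 7 because its coordinates are not legible, and the names `EXONS`, `exon_lengths` and `intron_lengths` are hypothetical.

```python
# Exon coordinates (genomic start, genomic finish) as listed in Table 2, part A.
# Exon 7 is omitted because its coordinates are not legible in the source.
EXONS = {
    1: (1058, 1336), 2: (1664, 1761), 3: (2038, 2279), 4: (2681, 2779),
    5: (2949, 2997), 6: (3145, 3204), 8: (3704, 3825), 9: (4110, 4188),
    10: (4818, 4939), 11: (5115, 5234), 12: (6209, 6338), 13: (6427, 6549),
    14: (6739, 6867), 15: (7447, 7550), 16: (8392, 8536), 17: (9556, 9703),
    18: (9839, 9943), 19: (10120, 10193), 20: (10395, 10550),
    21: (10928, 11002), 22: (11092, 11475),
}


def exon_lengths(exons):
    """Length of each exon, assuming inclusive coordinates."""
    return {n: finish - start + 1 for n, (start, finish) in exons.items()}


def intron_lengths(exons):
    """Length of intron n (between exon n and exon n + 1).
    Introns next to a missing exon are skipped rather than guessed."""
    lengths = {}
    for n in sorted(exons):
        if n + 1 in exons:
            lengths[n] = exons[n + 1][0] - exons[n][1] - 1
    return lengths


if __name__ == "__main__":
    for n, length in sorted(exon_lengths(EXONS).items()):
        print(f"exon {n}: {length} bp")
    for n, length in sorted(intron_lengths(EXONS).items()):
        print(f"intron {n}: {length} bp")
```

Introns 6 and 7 are left out on purpose, since both depend on the unreadable coordinates of exon 7.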
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Example 17 Expression of SBE II 

Investigation of the pattern of expression of SBE II revealed that the gene was only expressed in the endosperm. However, the timing of expression was quite distinct from that of SBE I, as illustrated in Figures 9a, 9b and 9c.

SBE I gene expression is only clearly detectable from the mid-stage of endosperm development (10 days after anthesis in Figure 9b), whereas SBE II gene expression is clearly seen much earlier, in endosperm tissue at 5-8 days after anthesis (Figures 9a and 9c), corresponding to an early stage of endosperm development. The hybridisation of wheat endosperm mRNA with the actin and ribosomal RNA genes is shown as controls (Figures 9f and 9g, respectively).

Example 18 Cloning of Wheat Soluble Starch Synthase cDNA

Primers for amplification of SSS I were synthesised from a conserved sequence region identified by comparison of the nucleotide sequences encoding soluble starch synthases of rice and pea. A 300 bp RT-PCR product was obtained by amplification of cDNA from wheat endosperm at 12 days post anthesis. The 300 bp RT-PCR product was then cloned, and its sequence analysed. Comparison of its sequence with rice SSS cDNA showed about 80% sequence homology. The 300 bp RT-PCR product was 100% homologous to the partial sequence of a wheat SSS I in the database produced by Block et al (1997).

The 300 bp cDNA fragment of wheat soluble starch synthase thus isolated was used as a probe for the screening of a wheat endosperm cDNA library (Rahman et al, 1997). Eight cDNA clones were selected. One of the largest cDNA clones (sm2) was used for DNA sequencing analysis, and gave a 2662 bp nucleotide sequence, which is shown in SEQ ID NO: 14. A large open reading frame of this cDNA encoded a 647 amino acid polypeptide, starting at nucleotides 247 to 250 and terminating at nucleotides 2198 to 2200. The deduced polypeptide was shown by protein sequence analysis to contain the N-terminal sequence of a 75 kDa granule-bound protein (Rahman et al, 1995). This is illustrated in Figure 16. The location of the 75 kDa protein was determined for both the soluble fraction and the starch granule-bound fraction by the method of Denyer et al (1995). Thus this cDNA clone encoded a polypeptide comprising a 41 amino acid transit peptide and a 606 amino acid mature peptide (SEQ ID NO: 12). The cleavage site LRRL was located at amino acids 36 to 39 of the transit peptide of this deduced polypeptide.

Comparison of wheat SSS I with rice SSS and potato SSS showed that there is 87.4% and 75.9% homology, respectively, at the amino acid level, and 74.7% and 58.1% homology, respectively, at the nucleotide level. Some amino acids in the N-terminal sequences of the SSS I of wheat and rice were conserved. Major features of the SSS I gene are summarized in Table 3.
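
Percentage figures of the kind quoted above are commonly derived by counting identical positions in a pairwise alignment. The following is a minimal sketch under stated assumptions: the two sequences are already aligned to equal length, gap columns are excluded from the count, and the function name and toy input are hypothetical. It is not the method or software used to obtain the values reported in this Example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over the gap-free columns
    of two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            identical += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical): 4 identical positions out of 5 gap-free columns.
print(percent_identity("ACGT-A", "ACCTTA"))
```

Different conventions for handling gaps and alignment ends give different percentages, so values computed this way are comparable only when the same convention is used throughout.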

Example 19 Isolation of Genomic Clone of Wheat Soluble Starch Synthase

Seven genomic clones were obtained with a 300 bp
cDNA probe by screening approximately 5 x 10 plaques from a 
genomic DNA library of Triticum tauschii, as described above. DNA was purified from 5 of these clones and digested with BamHI and SacI. Southern hybridization analysis using the 300 bp cDNA as probe showed that these clones could be classified into two classes, as shown in Figure 17. One genomic clone, sg3, contained a long insert, and was digested with BamHI or SacI and subcloned into pBluescript KS+ vector.
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Table 3 

Comparison of exons and introns of soluble starch synthases 

I genes of wheat and rice 

(1) Identity of exons of soluble starch synthase I genes of 
wheat and rice 
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region of exon 1 . 

Exon 15a: coding region of exon 15. Exon 15b: non- 
coding region of exon 15 . 

wSSI-Dl: wheat soluble starch synthase I gene. 

rSSI: rice soluble starch synthase I gene. 
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These subclones were analysed by sequencing. The 
intron/exon structure of the sg3 rice gene is shown in 
Figure 18. The SSS I gene from T. tauschii is shown in SEQ 
ID No: 13, while the deduced amino acid sequence is shown in 
SEQ ID NO: 14.

Example 20 Northern Hybridization Analysis of the Expression of Genes Encoding Soluble Starch Synthase

Total RNAs were purified from leaves, pre-anthesis material, and various stages of developing endosperm at 5-8, 10-15 and 18-22 days post anthesis. Northern hybridization analysis showed that mRNAs encoding wheat SSS I were specifically expressed in developing endosperm. Expression of these mRNAs in the leaves and pre-anthesis materials could not be detected by northern hybridization analysis under these experimental conditions. Wheat SSS I mRNAs were expressed at high levels from an early stage of endosperm development, 5-8 days post anthesis, and the expression level in endosperm at 10-15 days post anthesis was reduced. These results are summarized in Figure 9a and Figure 9d.

Example 21 Genomic Localisation of Wheat Soluble Starch Synthase

DNA from chromosome engineered lines was digested with the restriction enzyme BamHI and blotted onto supported nitrocellulose membranes. A probe prepared from the 3' end of the cDNA sequence, from positions 2345 to 2548, was used to hybridise to this DNA. The presence of a specific band was shown to be associated with the presence of chromosome 7A (Figure 19). These data demonstrate location of the SSS I gene on chromosome 7.

Example 22 Isolation of SSS I Promoter

We have isolated the promoter that drives this pattern of expression for SSS I. The pattern of expression for SSS I is very similar to that for SBE II: the SSS I gene transcript is detectable from an early stage of endosperm development until the endosperm matures. The sequence of this promoter is given in SEQ ID No: 15.



Example 23 Isolation of the Gene Encoding Debranching Enzyme from Wheat
The sugary-1 mutation in maize results in mature dried kernels that have a glassy and translucent appearance; immature kernels accumulate sucrose and other simple sugars, as well as the water-soluble polysaccharide phytoglycogen (Black et al, 1966). Most data indicate that in sugary-1 mutants the concentration of amylose is increased relative to that of amylopectin. Analysis of a particular sugary-1 mutation (su1-Ref) by James et al (1995) led to the isolation of a cDNA that shared significant sequence identity with bacterial enzymes that hydrolyse the α-1,6-glucosyl linkages of starch, such as an isoamylase from Pseudomonas (Amemura et al, 1988), i.e. bacterial debranching enzymes.

We have now isolated a sequence amplified from wheat endosperm cDNA using the polymerase chain reaction (PCR). This sequence is highly homologous to the sequence for the sugary gene isolated by James et al (1995). This sequence has been used to isolate homologous cDNA sequences from a wheat endosperm library and genomic sequences from Triticum tauschii.

Comparison of the deduced amino acid sequences of DBE from maize with spinach (Accession SOPULSPO, GenBank database), Pseudomonas (Amemura et al, 1988) and rice (Nakamura et al, 1997) enabled us to deduce sequences which could be useful in wheat. When these sequences were used as PCR amplification primers with wheat genomic DNA, a product of 256 bp was produced. This was sequenced and compared to the sequence of maize sugary isolated by James et al (1995). The results are shown in Figure 20a and Figure 20b. This sequence has been termed wheat debranching enzyme sequence I (WDBE-I).
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WDBE-1 was used to investigate a cDNA library 
constructed from wheat endosperm (Rahman et al, 1997), enabling us to isolate two cDNA clones which hybridise strongly to the WDBE-I probe. The nucleotide sequence of the DNA insert in the longest of these clones is given in SEQ ID No: 16.

Use of WDBE-I to investigate a genomic library constructed from T. tauschii, as described above, has led to the isolation of four genomic clones, designated I1, I2, I3 and I4, respectively, which hybridised strongly to the WDBE-I sequence. These clones were shown to contain copies of a single debranching enzyme gene. The sequence of one of these clones, I2, is given in SEQ ID No: 17. The intron/exon structure of the gene is shown in Figure 20c. Exons 1 to 4 were identified by comparison with the maize sugary-1 cDNA, while Exons 5 to 18 were identified by comparison with the cDNA sequence given in SEQ ID No: 16. The major features of the DBE I gene are summarized in Table 4.

Hybridization of WDBE-I to DNA from T. tauschii indicates one hybridizing fragment (Figure 21a). The chromosomal location of the gene was shown to be on chromosome 7 through hybridisation to nullisomic/tetrasomic lines of the hexaploid wheat cultivar Chinese Spring (Figure 21b).

We have clearly isolated a sequence from the wheat genome that has high identity to the debranching enzyme cDNA of maize characterised by James et al (1995). The isolation of homologous cDNA sequences and genomic sequences enables further characterisation of the debranching enzyme cDNA and promoter sequences from wheat and T. tauschii. These sequences and the WDBE-I sequences shown herein are useful in the manipulation of wheat starch structure through genetic manipulation and in the screening for mutants at the equivalent sugary locus in wheat.

Figure 9e shows that the DBE I gene is expressed during endosperm development in wheat and that the timing of expression is similar to that of the SBE II and SSS I genes. Figure 9h
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shows that the full length mRNA for the gene (3.0 kb) is 
found only in the wheat endosperm. 

Example 24 Transient Assays of Promoter-GFP Fusions

DNA constructs

DNA constructs for transient expression assays were prepared by fusing sequences from the BEII and SSI promoters to the gene encoding the Green Fluorescent Protein. Green Fluorescent Protein (GFP) constructs contained the GFP gene described by Sheen et al (1995). The nos 3' element (Bevan et al, 1983) was inserted 3' of the GFP gene. The plasmid vector (pWGEM_NZfp) was constructed by inserting the NotI to HindIII fragment from the following sequence:

5' GCGGCCGCTC CCTGGCCGAC TTGGCGGAAG CTTGCATGCC TGCAGGTCGA
CTCTAGAGGA TCCCCGGGTA CCGAGCTCGA ATTCATCGAT GATATCAGAT
CCGGGCCCTC TAGATGCGGC CGCATGCATA AGCTT 3'

into the NotI and HindIII sites of the pGEM-13Zf(-) vector (Promega). The sequences at the junction of the wSSSIpro1 and wSSSIpro2 and GFP were identical, and included the junction sequence:

5' ....CGCGCGCCCA CACCCTGCAG GTCGACTCTA GAGGATCCAT GGTGAGCAAG 3'.

The sequence at the junction of wsbeIIpro1 and GFP was:

5' GCGACTGGCT GACTCAATCA CTACGCGGGG ATCCATGGTG AGCAAGGGCG 3'.

The sequence at the junction of wsbeIIpro2 and GFP was:

5' GGACTCCTCT CGCGCCGTCC TGAGCCGCGG ATCCATGGTG AGCAAGGGCG 3'.

The structures of the constructs are shown in Figures 22a to 22f.
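
As an aid to reading the polylinker sequence given above, the positions of the NotI, HindIII and BamHI recognition sites in that sequence can be located with a simple string search. This is only a sketch: the sequence is transcribed as printed in this specification, positions are 0-based offsets of the first base of each recognition site, and the names `POLYLINKER`, `SITES` and `find_sites` are hypothetical.

```python
# Polylinker sequence as printed in this Example (spaces removed).
POLYLINKER = (
    "GCGGCCGCTC" "CCTGGCCGAC" "TTGGCGGAAG" "CTTGCATGCC" "TGCAGGTCGA"
    "CTCTAGAGGA" "TCCCCGGGTA" "CCGAGCTCGA" "ATTCATCGAT" "GATATCAGAT"
    "CCGGGCCCTC" "TAGATGCGGC" "CGCATGCATA" "AGCTT"
)

# Standard recognition sequences of the three enzymes.
SITES = {
    "NotI": "GCGGCCGC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}


def find_sites(seq, site):
    """Return every 0-based offset at which `site` occurs in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for enzyme, site in SITES.items():
        print(enzyme, find_sites(POLYLINKER, site))
```

In the sequence as printed, NotI and HindIII recognition sequences each occur twice, near the two ends, and the BamHI sequence occurs once in between.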
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Table 4 

Structural features of wDBEI-Dl 
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Endosperm box like motif TAAAACG 1463 
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Preparation of target tissue 

All explants used for transient assay were from the hexaploid wheat cultivar, Milliwang. Endosperm (10-12 days after anthesis), embryos (12-14 days after anthesis) and leaves (the second leaf from the top of plants containing 5 leaves) were used. Developing seed or leaves were collected, surface sterilized with 1.25% w/v sodium hypochlorite for 20 minutes and rinsed with sterile distilled water 8 times. Endosperms or embryos were carefully excised from seed in order to avoid contamination with surrounding tissues. Leaves were cut into 0.5 cm x 1 cm pieces. All tissues were aseptically transferred onto SD1SM medium, which is an MS based medium containing 1 mg/L 2,4-D, 150 mg/L L-asparagine, 0.5 mg/L thiamine, 10 g/L sucrose, 36 g/L sorbitol and 36 g/L mannitol. Each agar plate contained either 12 endosperms, 12 embryos or 2 leaf segments.



Preparation of gold particles and bombardment

Five µg of each plasmid was used for the preparation of gold particles, as described by Witrzens et al (1998). Gold particle-DNA suspension in ethanol (10 µl) was used for each bombardment using a Bio-Rad helium-driven particle delivery system, PDS-1000.

GFP assay

The expression of GFP was observed after 36 to 72 hours incubation using a fluorescence microscope. Two plates were bombarded for each construct. The numbers of expressing regions were recorded for each target tissue, and are summarized in Table 5. The intensity of the expression of GFP from each of the promoters was estimated by visual comparison of the light intensity emitted, and is summarized in Table 6.

The DNA construct containing GFP without a promoter region (pZLGFPNot) gave no evidence of transient expression in embryo (panel l) or leaf (panel r), and extremely weak and sporadic expression in endosperm (panel f); this construct gave only very weak expression in endosperm with respect to the number (Table 5) and intensity (Table 6) of transient expression regions. The constructs pwsssIpro1gfpNOT (panels b, h and n), psbeIIpro1gfpNOT (panels d, j and p), and psbeIIpro2gfpNOT (panels e, k and q) yielded low numbers (Table 5) of strongly (Table 6) expressing regions in leaves, and there was a very uneven distribution of expressing regions between target leaf pieces (Table 5). pwsssIpro2gfpNOT (panels c, i and o) gave no evidence of transient expression in leaves (Table 5). These results show that each of the promoter constructs is able to drive the transient expression of GFP in the grain tissues, endosperm and embryo. The ability of the short SSI promoter (pwsssIpro1gfpNOT, containing 1042 bp 5' of the ATG translation start site) to drive expression in leaves (panel n) contrasts with the inability of the long SSI promoter (pwsssIpro2gfpNOT, containing a 3914 base pair region 5' of the ATG translation start site; panel o), suggesting that regions controlling tissue specificity are located between -3914 and -1042 of the SSI promoter region (SEQ ID No: 15).

Example 25 Stable transformation of rice

Stable transformation of rice using Agrobacterium was carried out essentially as described by Wang et al (1997). The plasmids containing the target DNA constructs containing the promoter-reporter gene fusions are shown in Figure 23. These plasmids were transformed into Agrobacterium tumefaciens AGL1 by electroporation, and cultured on selection plates of LB media containing rifampicin (50 mg/L) and spectinomycin (50 mg/L) for 2 to 3 days, and then gently suspended in 10 ml NB liquid medium containing 100 µM acetosyringone and mixed well. Embryogenic rice calli (2 to 3 months old) derived from mature seeds were immersed in the A. tumefaciens AGL1
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Table 6 

Comparison of the Intensities of Transient Expression

Tissue      pact_js-gfg_nos   pwsssIpro1gfpNOT   pwsssIpro2gfpNOT   psbeIIpro1gfpNOT   psbeIIpro2gfpNOT   pZLGFPNot
Endosperm   10                4                  2.5                3.5                1.5                0.5
Embryo      10                5.5                5.5                1.5                1                  0
Leaf        10                20                 0                  10                 10                 0

All intensities are relative to pact_js-gfg_nos transient expression in the target tissue.

Relative intensities were independently scored by three researchers and averaged.
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suspension. After 3-10 minutes the A . tumef aciens AGLl 
suspension medium was removed, and the rice calli were transferred to NB medium containing 100 µM acetosyringone for 48 h. The co-cultivated calli were washed 7 times with sterile Milli Q H2O containing 150 mg/L timentin to remove all Agrobacterium, plated on to NB medium containing 150 mg/L timentin and 30 mg/L hygromycin, and cultured for 3 to 4 weeks. Newly-formed buds on the surface of rice calli were excised and plated onto NB Second Selection medium containing 150 mg/L timentin and 50 mg/L hygromycin. After 4 weeks of proliferation, calli were plated onto NB Pre-Regeneration medium containing 150 mg/L timentin and 50 mg/L hygromycin, and cultured for 2 weeks. The calli were then transferred on to NB-Regeneration medium containing 150 mg/L timentin and 50 mg/L hygromycin for 3 to 4 weeks. Once shooting occurs, shoots are transferred onto rooting medium (½ MS) containing 50 mg/L hygromycin. Once adequate root formation occurs, the seedlings are transferred to soil, grown in a misting chamber for 1-2 weeks, and grown to maturity in a containment glasshouse.

Example 26 Use of probes from SSS I, SBE I, SBE II and DBE sequences to identify null or altered alleles for use in breeding programmes

DNA primer sets were designed to enable amplification of the first 9 introns of the SBE II gene using PCR. The design of the primer sets is illustrated in Figure 24. Primers were based on the wSBE II-D1 sequence (deduced from Figure 13b and Nair et al, 1997; Accession No. Y11282) and were designed such that intron sequences in the wSBE II sequence were amplified by PCR. These primer sets individually amplify the first 9 introns of SBE II. One primer (sr913F) contained a fluorescent label at the 5' end. Following amplification, the products were digested with the restriction enzyme DdeI and analysed using an ABI 377 DNA Sequencer with Genescan™ fragment analysis software. One primer set, for intron 5, was found to amplify products from
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each of chromosomes 2A, 2B and 2D of wheat. This is shown in 
Figure 25, which illustrates results obtained with various wheat lines, and demonstrates that products from each of the wheat genomes from diverse wheats were amplified, and that therefore lines lacking the wSBEII gene on a specific chromosome could be readily identified. Lane (iii) illustrates the identification of the absence of the A genome wSBEII gene from the hexaploid wheat cultivar Chinese Spring ditelosomic line 2AS.

Figure 26 compares results of amplification with an Intron 10 primer set for various nullisomic/tetrasomic lines of the hexaploid wheat Chinese Spring. Fluorescent dUTP deoxynucleotides were included in the amplification reaction. Following amplification, the products were digested with the restriction enzyme DdeI and analysed using an ABI 377 DNA Sequencer with Genescan™ fragment analysis software. In lane (i), Chinese Spring ditelosomic line 2AS, a 300 base product is absent; in lane (ii), N2BT2A, a 204 base product is absent; and in lane (iii), N2DT2B, a 191 base product is absent. These results demonstrate that the absence of specific wSBEII genes on each of the wheat chromosomes can be detected by this assay. Lines lacking wSBEII forms can be used as parental lines in breeding programmes for generation of new lines in which expression of SBE II is diminished or abolished, with a consequent increase in the amylose content of the wheat grain. Thus a high amylose wheat can be produced.
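
The reasoning used to interpret Figure 26 (a product that is missing from a line lacking a given chromosome is assigned to that chromosome) can be written out as a short sketch. The product sizes and line names are taken from the text above; the sets of products assumed to be present in each lane, and the names `LINE_LACKS`, `OBSERVED` and `assign_products`, are illustrative assumptions rather than recorded data.

```python
# Chromosome (or genome copy) that each tester line lacks, per the text.
LINE_LACKS = {
    "Chinese Spring ditelosomic 2AS": "2A",
    "N2BT2A": "2B",
    "N2DT2B": "2D",
}

# Product sizes (bases) assumed to be detected in each lane: the text states
# which product is absent; the remaining products are assumed to be present.
OBSERVED = {
    "Chinese Spring ditelosomic 2AS": {204, 191},
    "N2BT2A": {300, 191},
    "N2DT2B": {300, 204},
}


def assign_products(observed, line_lacks):
    """Assign each product to the chromosome whose absence removes it."""
    all_products = set().union(*observed.values())
    assignment = {}
    for line, products in observed.items():
        for size in all_products - products:
            assignment[size] = line_lacks[line]
    return assignment


if __name__ == "__main__":
    for size, chromosome in sorted(assign_products(OBSERVED, LINE_LACKS).items()):
        print(f"{size} base product -> chromosome {chromosome}")
```

A breeding line of unknown constitution could then be screened in the same way: a product missing from its lane points to a null or altered wSBEII allele on the corresponding chromosome.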

Table 7 shows examples of primer pairs for SBE I, SSS I and DBE I which can identify genes from individual wheat genomes and could therefore be used to identify lines containing null or altered alleles. Such tests could be used to enable the development of wheat lines carrying null mutations in each of the genomes for a specific gene (for




example SBEI, SSI or DBE I) or combinations of null alleles 
for different genes. 

It will be apparent to the person skilled in the art that while the invention has been described in some detail for the purposes of clarity and understanding, various modifications and alterations to the embodiments and methods described herein may be made without departing from the scope of the inventive concept disclosed in this specification.

References cited herein are listed on the following pages, and are incorporated herein by this reference.
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SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL RESEARCH ORGANISATION
(B) STREET: Limestone Avenue
(C) CITY: Campbell
(D) STATE: ACT
(E) COUNTRY: AUSTRALIA
(F) POSTAL CODE (ZIP): 2612

(A) NAME: THE AUSTRALIAN NATIONAL UNIVERSITY
(B) STREET: BRIAN LEWIS CRESCENT
(C) CITY: ACTON
(D) STATE: ACT
(E) COUNTRY: AUSTRALIA
(F) POSTAL CODE (ZIP): 2601

(A) NAME: GOODMAN FIELDER LIMITED
(B) STREET: LEVEL 42, GROSVENOR PLACE
(C) CITY: SYDNEY
(D) STATE: NSW
(E) COUNTRY: AUSTRALIA
(F) POSTAL CODE (ZIP): 2000

(A) NAME: GROUPE LIMAGRAIN PACIFIC PTY LIMITED
(B) STREET: LEVEL 31, 1 O'CONNELL STREET
(C) CITY: SYDNEY
(D) STATE: NSW
(E) COUNTRY: AUSTRALIA
(F) POSTAL CODE (ZIP): 2000

(ii) TITLE OF INVENTION: REGULATION OF GENE EXPRESSION IN PLANTS

(iii) NUMBER OF SEQUENCES: 17

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "pcr primer based on the N-terminal sequence of wSBE I 5' end at position 168 of SEQ ID NO:5"

(iii) HYPOTHETICAL: NO
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(iv) ANTI-SENSE: 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GGCACGCGAG AGACTGG 17

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "pcr primer in which 5' end is at position 1590 of SEQ ID NO:5"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TACATTTCCT TGTCCATCA 19

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "pcr primer 5' end is at position 1 of SEQ ID NO:5"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATCACGAGAG CTTGCTCA 18

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "pcr primer 5' end is at position 334 of SEQ ID NO:5"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CGGTACACAG TTGCGTCATT TTC 23

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2687 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



ATCGACGAAG ATGCTCTGCC TCACCGCCCC CTCCTGCTCG CCATCTCTCC CGCCGCGCCC 60
CTCCCGTCCC GCTGCTGACC GGCCCGGACC GGGGATTTCG GCCAAGAGCA AGTTCTCTGT 120
TCCCGTGTCT GCGCCAAGAG ACTACACCAT GGCAACAGCT GAAGATGGTG TTGGCGACCT 180
TCCGATATAC GATCTGGATC CGAAGTTTGC CGGCTTCAAG GAACACTTCA GTTATAGGAT 240
GAAAAAGTAC CTTGACCAGA AACATTCGAT TGAGAAGCAC GAGGGAGGCC TTGAAGAGTT 300
CTCTAAAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA 360
ATGGGCCCCT GCAGCAATGG ATGCACAACT TATTGGTGAC TTCAAGAACT GGAATGGCTC 420
TGGGCACAGG ATGACAAAGG ATAATTATGG TGTTTGGTCA ATCAGGATTT CCCATGTCAA 480
TGGGAAACCT GCCATCCCCC ATAATTCCAA GGTTAAATTT CGATTTCACC GTGGAGATGG 540
ACTATGGGTC GATCGGGTTC CTGCATGGAT TCGTTATGCA ACTTTTGACG CCTCTAAATT 600
TGGAGCTCCA TATGACGGTG TTCACTGGGA TCCACCTTCT GGTGAAAGGT ATGTGTTTAA 660
GCATCCTCGG CCTCGAAAGC CTGACGCTCC ACGTATTTAC GAGGCTCATG TGGGGATGAG 720
TGGTGAGAGG CCTGAAGTAA GCACATACAG AGAATTTGCA GACAATGTGT TACCGCGCAT 780
AAAGGCAAAC AACTACAAGA GAGTTCAGCT GATGGCAATG ATGGAACATT CCATATTATG 840
CTTCTTTTGG TACGATGTGA CGAATTTGTT CGCAGTTAGC AGCAGATCAG GAACACCAGA 900
GGACCTCAAA TATCTTGTTG ACAAGGGACA TAGCTTAGGG TTGGGTGTTC TGATGGATGT 960
TGTCGATAGC CATGCGAGCA GTAATATGAC AGATGGTCTA AATGGCTATG ATGTTGGACA 1020
AAACACACAG GAGTCCTATT TCCATACAGG AGAAAGGGGT TATCATAAAC TGTGGGATAG 1080
TCGCCTGTTC AACTATGCCA ATTGGGAGGT CTTACGGTAT CTTCTTTCTA ATCTGAGATA 1140
TTGGATGGAC GAATTCATGT TTGACGGCTT CCGATTTGAT GGAGTAACAT CCATGCTATA 1200
TAATCACCAT GGTATCAATA TGTCATTCGC TGGAAATTAC AAGGAATATT TTGGTTTGGA 1260
TACCGATGTA GATGCAGTTG TTTACATGAT GCTTGCGAAC CATTTAATGC ACAAAATCTT 1320
GCCAGAAGCA ACTGTTGTTG CAGAAGATGT TTCAGGCATG CCAGTGCTTT GTCGGTCAGT 1380
TGATGAAGGT GGAGTAGGGT TTGACTATCG CCTTGCTATG GCTATTCCTG ATAGATGGAT 1440
TGACTACTTG AAGAACAAAG ATGACCTTGA ATGGTCAATG AGTGCAATAG CACATACTCT 1500
GACCAACAGG AGATATACGG AAAAGTGCAT TGCATATGCT GAGAGCCACG ATCAGTCTAT 1560
TGTTGGCGAC AAGACTATGG CATTTCTCTT GATGGACAAG GAAATGTATA CTGGCATGTC 1620
AGACTTGCAG CCTGCTTCAC CTACAATTGA TCGTGGAATT GCACTTCAAA AGATGATTCA 1680
CTTCATCACC ATGGCCCTTG GAGGTGATGG CTACTTGAAT TTTATGGGTA ATGAGTTTGG 1740
CCACCCAGAA TGGATTGACT TTCCAAGAGA AGGCAACAAC TGGAGTTATG ATAAATGCAG 1800
ACGCCAGTGG AGCCTCTCAG ACATTGATCA CCTACGATAC AAGTACATGA ACGCATTTGA 1860
TCAAGCAATG AATGCGCTCG ACGACAAGTT TTCCTTCCTA TCGTCATCAA AGCAGATTGT 1920
CAGCGACATG AATGAGGAAA AGAAGATTAT TGTATTTGAA CGTGGAGATC TGGTCTTCGT 1980
CTTCAATTTT CATCCCAGTA AAACTTATGA TGGTTACAAA GTCGGATGTG ATTTGCCTGG 2040
GAAGTACAAG GTAGCTCTGG ACTCCGATGC TCTGATGTTT GGTGGACATG GAAGAGTGGC 2100
CCAGTACAAC GATCACTTCA CGTCACCTGA AGGAGTACCA GGAGTACCTG AAACAAACTT 2160
CAACAACCGC CCTAATTCAT TCAAAGTCCT GTCTCCACCC CGCACTTGTG TGGCTTACTA 2220
TCGCGTCGAG GAAAAAGCGG AAAAGCCTAA GGATGAAGGA GCTGCTTCTT GGGGCAAAGC 2280
TGCTCCTGGG TACATCGATG TTGAAGCCAC TCGTGTCAAA GACGCAGCAG ATGGTGAGGC 2340






GACTTCTGGT TCCAAAAAGG CGTCTACAGG AGGTGACTCC AGCAAGAAGG GAATTAACTT 2400
TGTCTTCGGG TCACCTGACA AAGATAACAA ATAAGCACCA TATCAACGCT TGATCAGAAC 2460
CGTGTACCGA CGTCCTTGTA ATATTCCTGC TATTGCTAGT AGTAGCAATA CTGTCAAACT 2520
GTGCAGACTT GAGATTCTGG CTTGGACTTT GCTGAGGTTA CCTACTATAT AGAAAGATAA 2580
ATAAGAGGTG ATGGTGCGGG TCGAGTCCGG CTATATGTGC CAAATATGGG CCATCCCGAG 2640
TCCTCTGTCA TAAAGGAAGT TTCGGGCTTT CAGCCCAGAA TAAAAAA 2687

(2) INFORMATION FOR SEQ ID NO: 6: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 807 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..807

(D) OTHER INFORMATION: /label= sbel
/note= "deduced amino acid sequence from SEQ ID NO:5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Leu Cys Leu Thr Ala Pro Ser Cys Ser Pro Ser Leu Pro Pro Arg
1 5 10 15

Pro Ser Arg Pro Ala Ala Asp Arg Pro Gly Pro Gly Ile Ser Ala Lys
20 25 30

Ser Lys Phe Ser Val Pro Val Ser Ala Pro Arg Asp Tyr Thr Met Ala
35 40 45

Thr Ala Glu Asp Gly Val Gly Asp Leu Pro Ile Tyr Asp Leu Asp Pro
50 55 60

Lys Phe Ala Gly Phe Lys Glu His Phe Ser Tyr Arg Met Lys Lys Tyr
65 70 75 80

Leu Asp Gln Lys His Ser Ile Glu Lys His Glu Gly Gly Leu Glu Glu
85 90 95

Phe Ser Lys Gly Tyr Leu Lys Phe Gly Ile Asn Thr Glu Asn Asp Ala
100 105 110

Thr Val Tyr Arg Glu Trp Ala Pro Ala Ala Met Asp Ala Gln Leu Ile
115 120 125

Gly Asp Phe Asn Asn Trp Asn Gly Ser Gly His Arg Met Thr Lys Asp
130 135 140

Asn Tyr Gly Val Trp Ser Ile Arg Ile Ser His Val Asn Gly Lys Pro
145 150 155 160

Ala Ile Pro His Asn Ser Lys Val Lys Phe Arg Phe His Arg Gly Asp
165 170 175

Gly Leu Trp Val Asp Arg Val Pro Ala Trp Ile Arg Tyr Ala Thr Phe
180 185 190

Asp Ala Ser Lys Phe Gly Ala Pro Tyr Asp Gly Val His Trp Asp Pro
195 200 205

Pro Ser Gly Glu Arg Tyr Val Phe Lys His Pro Arg Pro Arg Lys Pro
210 215 220

Asp Ala Pro Arg Ile Tyr Glu Ala His Val Gly Met Ser Gly Glu Arg
225 230 235 240

Pro Glu Val Ser Thr Tyr Arg Glu Phe Ala Asp Asn Val Leu Pro Arg
245 250 255

Ile Lys Ala Asn Asn Tyr Asn Thr Val Gln Leu Met Ala Ile Met Glu
260 265 270

His Ser Ile Leu Cys Phe Phe Trp Tyr His Val Thr Asn Phe Phe Ala
275 280 285

Val Ser Ser Arg Ser Gly Thr Pro Glu Asp Leu Lys Tyr Leu Val Asp
290 295 300

Lys Ala His Ser Leu Gly Leu Arg Val Leu Met Asp Val Val His Ser
305 310 315 320

His Ala Ser Ser Asn Met Thr Asp Gly Leu Asn Gly Tyr Asp Val Gly
325 330 335

Gln Asn Thr Gln Glu Ser Tyr Phe His Thr Gly Glu Arg Gly Tyr His
340 345 350

Lys Leu Trp Asp Ser Arg Leu Phe Asn Tyr Ala Asn Trp Glu Val Leu
355 360 365

Arg Tyr Leu Leu Ser Asn Leu Arg Tyr Trp Met Asp Glu Phe Met Phe
370 375 380

Asp Gly Phe Arg Phe Asp Gly Val Thr Ser Met Leu Tyr Asn His His
385 390 395 400

Gly Ile Asn Met Ser Phe Ala Gly Asn Tyr Lys Glu Tyr Phe Gly Leu
405 410 415

Asp Thr Asp Val Asp Ala Val Val Tyr Met Met Leu Ala Asn His Leu
420 425 430

Met His Lys Ile Leu Pro Glu Ala Thr Val Val Ala Glu Asp Val Ser
435 440 445

Gly Met Pro Val Leu Cys Arg Ser Val Asp Glu Gly Gly Val Gly Phe
450 455 460

Asp Tyr Arg Leu Ala Met Ala Ile Pro Asp Arg Trp Ile Asp Tyr Leu
465 470 475 480

Lys Asn Lys Asp Asp Leu Glu Trp Ser Met Ser Ala Ile Ala His Thr
485 490 495

Leu Thr Asn Arg Arg Tyr Thr Glu Lys Cys Ile Ala Tyr Ala Glu Ser
500 505 510

His Asp Gln Ser Ile Val Gly Asp Lys Thr Met Ala Phe Leu Leu Met
515 520 525

Asp Lys Glu Met Tyr Thr Gly Met Ser Asp Leu Gln Pro Ala Ser Pro
530 535 540

Thr Ile Asp Arg Gly Ile Ala Leu Gln Lys Met Ile His Phe Ile Thr
545 550 555 560

Met Ala Leu Gly Gly Asp Gly Tyr Leu Asn Phe Met Gly Asn Glu Phe
565 570 575

Gly His Pro Glu Trp Ile Asp Phe Pro Arg Glu Gly Asn Asn Trp Ser
580 585 590

Tyr Asp Lys Cys Arg Arg Gln Trp Ser Leu Ser Asp Ile Asp His Leu
595 600 605

Arg Tyr Lys Tyr Met Asn Ala Phe Asp Gln Ala Met Asn Ala Leu Asp
610 615 620

Asp Lys Phe Ser Phe Leu Ser Ser Ser Lys Gln Ile Val Ser Asp Met
625 630 635 640

Asn Glu Glu Lys Lys Ile Ile Val Phe Glu Arg Gly Asp Leu Val Phe
645 650 655

Val Phe Asn Phe His Pro Ser Lys Thr Tyr Asp Gly Tyr Lys Val Gly
660 665 670

Cys Asp Leu Pro Gly Lys Tyr Lys Val Ala Leu Asp Ser Asp Ala Leu
675 680 685

Met Phe Gly Gly His Gly Arg Val Ala Gln Tyr Asn Asp His Phe Thr
690 695 700

Ser Pro Glu Gly Val Pro Gly Val Pro Glu Thr Asn Phe Asn Asn Arg
705 710 715 720

Pro Asn Ser Phe Lys Val Leu Ser Pro Pro Arg Thr Cys Val Ala Tyr
725 730 735

Tyr Arg Val Glu Glu Lys Ala Glu Lys Pro Lys Asp Glu Gly Ala Ala
740 745 750

Ser Trp Gly Lys Ala Ala Pro Gly Tyr Ile Asp Val Glu Ala Thr Arg
755 760 765

Val Lys Asp Ala Ala Asp Gly Glu Ala Thr Ser Gly Ser Lys Lys Ala
770 775 780

Ser Thr Gly Gly Asp Ser Ser Lys Lys Gly Ile Asn Phe Val Phe Gly
785 790 795 800

Ser Pro Asp Lys Asp Asn Lys
805

(2) INFORMATION FOR SEQ ID NO: 7: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 319 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: misc_signal

(B) LOCATION: 1..319

(D) OTHER INFORMATION: /function= "3' untranslated region of wSBE I-D4 cDNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GCGACTTCTG GTTCCAAAAA GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC 60
TTTGTCTTCG GGTCACCTGA CAAAGATAAC AAATAAGCAC CATATCAACG CTTGATCAGA 120
ACCGTGTACC GACGTCCTTG TAATATTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA 180
CTGTGCAGAC TTGAGATTCT GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT 240
AAATAAGAGG TGATGGTGCG GGTCGAGTCC GGCTATATGT GCCAAATATG CGCCATCCCG 300
AGTCCTCTGT CATAAAGGA 319

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4890 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: promoter

(B) LOCATION: 1..4890

(D) OTHER INFORMATION: /function= "promoter containing sequence of SBE I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGGTGGCGGG TCGGGCGGCA AGGCGCGGGG CGGCGGGGCG GCCGGGGCGG CGCGGCGGCG 60
CGGGCGGCAG CGGGGGCTAG GGTTTCGCGG CGGCGGCGAC TTGGGCTGAG GCGGGGCACG 120
GGCTGCGGCT TTAAAGGCCG GCCAGGCTGA GGTGTCCGGG TCGGACACGG CCCGTAAGGC 180
GGTTGACTTT AAAAAATAAT AATTCGGACA TGCAAAAAAG TAAGAAAAGA AATAATAAAC 240
GGACTCCAAA AATCCCGAAG TAAATTTTTC CCCATTCTTA AAAATAAGCC GGACAAGATG 300
AACATTTATT TGGGCCTAAA ATGCAATTTT GAAAAATGCG TATTTTTCCT AATTCGGAAT 360
AAAATCAAAT AAAATCCAAA TAAAATCAAA TATTTGTTTT TAATATTTTT CCTCCAATAT 420
TTCATTATTT GTGAAGAAGT CATTTTATCC CATCTCATAT ATTTTGATAT GAAATATTTT 480
CGGAGAGAAA AATAATTAAA ACAAATGATC CTATTTTCAA AATTTGAGAA AACCCAAATA 540
TGAAAATAAC GAAATCCCCA ACTCTCTCCG TGGGTCCTTG AGTTGCGTGA AATTTCTAGG 600
ATCACAAATC AAAATGCAAT AAAATATGAT ATGCATGATG ATCTAATGTA TAACATTCCA 660
ATTGAAAATT TGGGATGTTA CATATAACTC AAATTCTATA ATTATGAACA CAGAAATATT 720
AATGTAGAAC TCTATTTTGT TTTGAAATTG TATTATTTTT TAGAATTAGT CTAGAGCATT 780
TCGTGAACTT GAATCAAACC TTTAAATAAA ACAAAGCATA AAAATGACAA ATTCACATAT 840
GAAATAACTT GTGTTACATA GATTTATTAC AATAGCGTTG TATGTGTGTA TGTGTGCGTG 900
AGTGCCTATG GTAATATGAA TAAATATCTT GATAGATGTT TCTACAATTC ACGGGTCTAA 960
CTAGTAATGC AATGCAATGC ATGCTAAAAG AATAGAACGT TAGTTTCATT TAACTAACAA 1020
TTTTCAAATG TATGAGTTGC CAACAAGTGG CATACTTGGC ACTGTTTGTT TGTTCATTTT 1080
ATGGAAAGTT CTTCTCTTTT TACATGGTTT AGATTCCAGC ATGTAGCCAG AAAATATGAT 1140
TGTCAAAAGA TAATACCTCA TAATACAATT CCACTAAAGT CACCTAGCCC AAGTGACCGA 1200
CCTGATCCTG AAATAAAATG AGAAGATTTG GTGTCATCAT CATGACAACA AATTATTAGG 1260
CGGTAGATCT TGTGGTAGTA GTCATGATGT AAAATTATCA AGAGGGAGAG AATGTATGGA 1320
GATTTATGTG AAGTACATCG TACACCAGAC ATAGTTGACA CATCGATTTT TTAAGATACA 1380
TTTGGACGCG CCTTGTGGGA GTGTAAAGTA CTACCATGTA TTAGAAGAGG TGAAATGAGA 1440
AATGCCATAG CTAGCAAGTA GGCCTAGTTA AGGAAATTCT TCCTTAGATC CCCTTCTCCC 1500
GAAGAGTGAA GTGCTTCAAC TAAAGGTTAG ACCCACTTAA AAAATGTCAC TTTGAATCTT 1560
TGCTTCCCTT GTCGTAATCC TGTGCATTTG TAGGTCCCTC GGATCTGAGC CCTTTCTCCA 1620
AGCCCTTCAT TGGATTCCCC TGGATGTCTT TTTGTTACAT TTTATTGAAG TGAGAGTGAA 1680
TTATTATATG CCCATAGGAG GTGGGATATA AAGGCTGTTG GTATTCTGCA CGATACATGC 1740
TAGAGTAGGG AGGAGAGGCT GGTGCATGAT ACATGGTGGA CTAGCCCATA TATTTACCCC 1800
TCCCCCACCC ACTAACAAGT TTTTTTTATT AGGTCTTCAT CCTCTGATTT GTTTTTCTGT 1860
TAGCCCATTC TTCATCATGG ACTTATTAAT CATGATTAGT TTCTTGGATT TTTGTTTACT 1920
TGACTTGAAT TTGACAATGT GCCTCATATA TGGCATGTGG GACTGATAGG AAGATATATT 1980
CTCACAACAT TAACTTAAAA AGGATTATTT TTTTGGTGCA GTCGTAAAGA AAACTACTTT 2040
CTTTTATGCT AAAAGTTATT CAAACATAGA TTTATAAACA AAGGATATCA CCATGCATGA 2100
CCATGCGCTC TCTCATGTTT ACTCTAGAAA CCATATATCT CTTTGTTGCA AAATATTTAA 2160
TCTATCCTCC TTGTTTCTGG GAATGAGTCG GGGAAGGTAA TCTTAGGGAA GGTTAAAGTG 2220
AGGCAAGTAA GAGCAACTCT AGCAGAGTCG CGATATGCCC AATCGCCATA ATGCCAATAT 2280
GGCATTTTTG GCCCAAAATG GCACTTCAGA AGAGTCACCA TATCCCTTCG GATAGCCATA 2340
ATTTAGGGAG CTCGCTCCAC AAACAAGCTT CGAGCCTCCA AATATGGAGG CCATGGATTC 2400
GTTGTTTGGC ACTCACTCCA TATCCAACCG CAAGCGCATG CATGAGGGAA GTTTTAGCTT 2460
CTTCCTCCTT GCGCCAACGC CGGGATTTTA CACAGCGCAT TACAGGTACA TGAACCAGCA 2520
TGCACAGATA ATCACCGACG AGTGGGGTGA CAAGAAGGAT AAGCACCCTC CCATTAGTGG 2580
TGCGCCCACT CCCCTCAAAT TCATGAGGCA GCCATTTGGA TGGTCATCGC GTGGCATAAG 2640
CTCCGACTAT AAAATCTCAA CGGCATCACC AAAACCATAG CTGCCGCCTC CCCCTTCCTC 2700
GGCATCACCT CCCCAAGACA TCTCCTCCCC TCTATGCCAC AATGTCATCA TTATGGAGAG 2760
ACACAACTAC TGGTAAACCG CATACCCAAT CATGGTTTAC CGGCAGTGCG AACCCCACCT 2820
TCCTCCCACG ATGGTAGGAT ATTCTCCTCC TAGAATGGCG CGTGTGGCGC TTCCTCCTCC 2880
CGAGGCTGAT ATGTCGGCTC CCATGATGGC GTGCATCATT GATTTGGCGC TTCGGGTCCA 2940
TCATACATGT TAACGAGGTC ATCCCCATTG ATGTCGTTGG TCCCCTTGCC CCCCAGTCGG 3000
ATCCTGAGGA CCCGTTCGAT GTCGCAATGC GACTCTCCAA ACTCAAAGCT CACAATGAGG 3060
AGTACGTCCT CTAGGAGTTC CGCCCCGCAA CCATCTATAA GGAGGAGCAA CGATAGCTCT 3120
CCCCTACGCC TTCCTCGACG ATCTCTCTTA GGAGGACAAC GGCTAGACGA CGGCGGCGGC 3180
GGCGAAGGTA CTGCAGGTAG TAGAACATAG CAATGTCGAA TGGCGACATT GCATATTTTG 3240
AAAATGTCGC TCAACGACTT TTGAAGTCGC AAATAAAATG TAGTGTGACT ACTTTTGGCC 3300
AGCAATATAA GTTTATCACA TTTGATAATG ATTTGAACCG GTGTGGTTCA ACTAAATGTA 3360
CCATAAATTG AACATACAAA TTTTTAGCAA ATGAAAAAAG AAACAAGTAA GACCACAAAT 3420
ATGAAAGCCG CATATCGCGA CTATGTGTTT GAGCCGCAGC TGCCAAGTAC ATATGAAGCG 3480
TACTCCATAT GACATACGAC AACCATACAT ATGAAGACTC TACTAGAGTT CTCTAAGGCC 3540
GCTTTTAGCG CCTTTCGTGC AGTGGTGCCC ATAGGGAGTG AGGGTAGTTG GACTGTTCGT 3600
TTCCCCTTTT TTCATTTCTT TGAAATCTAT TTTATTTTTT TTCTCTTTTG TAGGTTTCCC 3660
AAATTTATAT ACCATTTTTC TGTTTCTCGC TATTTTTTGT TGTTATATTC TAGTTTCATA 3720
TTTTTCTATT ATTAATTTGT GTCTCTTATG AGAAGTCCAG ACTTGCATAT GGAGGTGCAC 3780






ACACAAACAT ATAAAGTATA AATACTAACT TGAGAAGTAT GTTTGCGTGG TCAAAAAAAC 3840
ATCATCAAAA CCTGCCAATA TGAGATATAG TTTTGAATAT ATCAATATGA GCAACGCAAC 3900
CATTTAAAAT GTGAACAATT GTTTTTTTAG AAAAAATATA AGAAATAACT CCAACCCAGC 3960
CAAACCACAT GCTATACACT TGCTCCATAT GAAACCATGT TTGCTATTGG GCAGTTGCCT 4020
GAAACCGAAA GTAATGTTAG CCGTTTTTCT ATTCAAAGAA GAAGGAGAGT CGAGGTGACG 4080
CGATGCTTAG ACGTGAGATG GGGATGACCA CAACGTCCCT ACAGAGACCT CACCGGAGAT 4140
GGGGACATTG CAGTTGACAC GAGAGCGGTG AGGGGCTGCG ATGCGTGTGC GGCAACATGT 4200
GGCGAGGCGG ACGTCGGGCT GGCAGGTAGG GGGGAGGGGG AAGGACCGGG GGAGGAAGAA 4260
GAGGAGTAGC CTGCAAAACA TGGTACACCA GTTTTCTGCC CTACGAAAAC CTCATTTCAT 4320
TCCCCCACCC TGACAAGCAA CAACCAACCA TCGCAGTCCC ACATGTCCCT CTGGTCTTTG 4380
CAAAAAGTAA TTGTTCTTGC TGGACAGCGC AAAGAGTAAA CTTTTGTTAG TTTTCATTTC 4440
TAGAAAAAGC AATCCTTTTA TAGTTCTTTT GTGAAAGTAA TGCTTTTATA GTGATTGGGA 4500
TGTTCTTTTA GAGCAAATAT CTTCTTTTTT TTTTAGGGAA AAGAGCAAAT ATCTTCCACT 4560
TTTCACAAAA CTGACGAAGG CTGAAAGTGG CGAGACAGTG AGGGCCCATA GCTTTCGTCC 4620
GGCCCAGCGG CGCACGACCG TCCACGTGCA CCCCGGCGCT CCCGGGCCCG CAGATCCGTT 4680
CTCCCTCGCC CCCGTTTCCC CCTCCCTCCC TCTCGTTGCT TCCACTCCAC TGTTCTCCTC 4740
TTCCTGTCCA AAGCGGGCAC GGACCGGAAA AAAATCACGC CTTTCCGTTG GGTCTCCGGC 4800
GCCACACTCC TCCTCCGGCC GATATAAAGC GCGCGGGGCC ACGGGCCCGG CGCAAAATGG 4860
GATTCCCGTC CGCCGCCATG GAGGAAGATG 4890

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6228 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..6228

(D) OTHER INFORMATION: /product= "coding region of wSBE I-D4 gene"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACGGGCCCGG CGCAAAATGG GATTCCCGTC CGCCGCCATC GACGAAGATG CTCTGCCTCA 60
CCGCCCCCTC CTGCTCGCCA TCTCTCCCGC CGCGCCCCTC CCGTCCCGCT GCTGACCGGC 120
CCGGACCGGG GATCTCGGTG AGTCAGTCGG GATCTTCATT TCTTTTCTTT TCTTTCGTTT 180
CCGGCTCCGT TCTGCCGGGG TTTCCCTGAT GCGATGCCGC GCGCGCGCAG GGCGGCGGCA 240
ATGTGCGGCT GAGCGCGGTG CCCGCGCCCT CTTCGCTCCG CTGGTCGTGG CCGCGGAAGG 300
TGAGCCCTCT CCCCTGTCTA CCCAGATTTG CGACCGTGAT CCCCTGTTGT CGCCGGGCAA 360
ACGGAATCTG ATCCACGGTG GTTATTGGAA ATAGTATATA CTACTAATAA ACTTGAGGCT 420
GGGATTCGTC CACTGAGGAA CAAGTGGATG CGATTTCGAT TGGATTTCTC TGCTTTATGC 480
GATCCGTACG CAGAATATCC CTCCTGCAGT GTCTCAACCG TATTACTGGA TGTACAACCC 540
AAATGTGTAT AATCTGTGCT GAATGTATCA ACCAATAATT GCTGCATTGT GAAAACATAA 600
TCCTGTGTTG TGTCTCTACT ACTTGTTCAG TCCTGATCTG CCGCTTATCC TAACTTTTGT 660
TCATTTATGG AAGGCCAAGA GCAAGTTCTC TGTTCCCGTG TCTGCGCCAA GAGACTACAC 720
CATGGCAACA GCTGAAGATG GTGTTGGCGA CCTTCCGATA TACGATCTGG ATCCGAAGTT 780
TGCCGGCTTC AAGGAACACT TCAGTTATAG GATGAAAAAG TACCTTGACC AGAAACATTC 840
GATTGAGAAG CACGAGGGAG GCCTTGAAGA GTTCTCTAAA GGTTAGCTTT TGTTTCATGT 900
GTTTGAAACA ATAGTTACAT CTTGTGGCGT CCGCAGCACA AAAGACATAA TGCGACTCTG 960
TTTTGTAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA 1020
ATGGGCCCCT GCAGCAATGT AAGTTCTAGT GTTGTCACGC AACTAATTGC AATGGTCGTT 1080
GGTTAACTTA TGAAGTGCTG ATGAAACTGT CTTAAGAGTT TATGGCTTGT CTTTTCTGAT 1140
TCTAGCTAGT AAAGAGTAGA TAAATATGAA ATATGTTTTC CCTTTTCTAG TTATGGTCAT 1200
GGTTGGCTGG TATTCATTTC TTTTATGGCA ATACTTGCTT CTAACTATCT TTAGTAGATT 1260
CATGTATTTA CTTGTGAGTC ATTACTTTAT GGGTGTAGGG ATGCACAACT TATTGGTGAC 1320
TTCAACAACT GGAATGGCTC TGGGCACAGG ATGACAAAGG ATAATTATGG TGTTTGGTCA 1380
ATCAGGATTT CCCATGTCAA TGGGAAACCT GCCATCCCCC ATAATTCCAA GGTTAAATTT 1440
CGATTTCACC GTGGAGATGG ACTATGGGTC GATCGGGTTC CTGCATGGAT TCGTTATGCA 1500
ACTTTTGATG CCTCTAAATT TGGAGCTCCA TATGACGGTG TTCACTGGGA TCCACCTTCT 1560
GGTGAAAGGT CTACTTTTAG TGGCTCGAGA GCAAGAAATC TAAGTAAAAC CCACACAATT 1620
AACTTACATT AATGTGGAGA CATGATACTT TTATTGCTCG TTTTGCAGGT ATGTGTTTAA 1680
GCATCCTCGG CCTCGAAAGC CTGACGCTCC ACGTATTTAC GAGGCTCATG TGGGGATGAG 1740
TGGTGAAAAG CCTGAAGTAA GCACATACAG AGAATTTGCA GACAATGTGT TACCGCGCAT 1800
AAAGGCAAAC AACTACAACA CAGTTCAGCT GATGGCAATC ATGGAACATT CATATTATGC 1860
TTCTTTTGGG TACCATGTGA CGAATTTCTT CGCAGTTAGC AGCAGATCAG AACGCCAGAG 1920
ACCTCAATAT CTTGTTGACA AGGCACATAG TTTACGGTTG CGTGTTCTGA TGGATGTTGT 1980
CCATAGCCAT GCGAGCAGTA ATAAGACAGA TGGTCTTAAT GGCTATGATG TTGGGCAAAA 2040
CACACAGGAG TCCTATTTCC ACACAGGAGA AAGGGGCTAT CATAAACTGT GGGATAGCCG 2100
CCTGTTCAAC TATGCCAATT GGGAGTCTTA CGATTTCTTC TTTCTAATCT GAGATATTGG 2160
ATGGACGAAT TCATGTTTGA TGGCTTCCGA TTTGATGGGG TAACATCCAT GCTATATAAT 2220
CACCATGGTA TCAATATGTC ATTCGCTGGA AGTTACAAGG AATATTTTGG TTTGGATACT 2280
GATGTAGATG CAGTTGTTTA CCTGATGCTT GCGAACCATT TAATGCACAA ACTCTTGCCA 2340
GAAGCAACTG TTGTTGCAGA AGATGTTTCA GGCATGCCAG TGCTTTGTCG GTCAGTTGAT 2400
GAAGGTGGAG TAGGGTTTGA CTATCGCCTG GCTATGGCTA TTCCTGATAG ATGGATCGAC 2460
TACTTGAAGA ACAAAGATGA CCTTGAATGG TCAATGAGTG GAATAGCACA TACTCTGACC 2520
AACAGGAGAT ATACGGAAAA GTGCATTGCA TATGCTGAGA GCCATGATCA GGTATGTTTT 2580
CCCTCCTTTG TCGCTGTGCG TGAGTATGTG TTCTTTTTTT ATGGGGCACT GGTCTAAGAA 2640
CATACAGTTC AAAGGTGAGA CACTTTCTTT GCCTGGTAGA CAAATTTGAG AAATAAACAT 2700
TTCGCTTGAT GACTTTTAGT TGCTTCACAA GTTCGAATTA AGTTAGTTAT ATTCTGATAA 2760
CTAGTGATAG TACCCACTAA CCAGCTATTA CGGACCATGT AAGAATGTCC GAAGACTGCA 2820
GTTATATATC GTTGACTTTG TGTTCATCTA TTGAAACAAC TTAGTAGTTA ACTTTCACGC 2880
AAATTTTCAG TCTATTGTTG GCGACAAGAC TATGGCATTT CTCTTGATGG ACAAGGAAAT 2940
GTATACTGGC ATGTCAGACT TGCAGCCTGC TTCGCCTACA ATTGATCGTG GAATTGCACT 3000
TCAAAAGGTT CGATTCGTTT TAAGTATTCC TGAATTTGAT GTTCTAGTTC CAGACGAGTA 3060
TTGTAATGTT CGTTGTTACT CAGAGTTCTG CTTAGTCCTT GAAGATAATG TATTCCAGTC 3120
CCTTTTGGTA CATTTGGCTT ATTTTGTTAC AAATATTTCA GATGATTCAC TTCATCACCA 3180
TGGCCCTTGG AGGTGATGGC TACTTGAATT TTATGGGTAA TGAGGTAATA TCTGGTTATC 3240
TGTCAAAACT TATTTCTGAT CAATATGTTT CGGGATTCCC TCGAAAAAAA TCCTTTGGGC 3300
AGGGCGAAAA GTTTAAACAT CTGTTTTCTA TGATAGCCAA GTACTCCCCA GCTATTTCCA 3360
TGTTATCACG TATCATTTAG CTGTGCCGGT AGTTAATCTT TATTCTAATT CATTGTTGTT 3420
TTTTAGCGTG GCAGTCTATT GTTGGATCCT CTTATTCCAA TTACATATAT GCCGACATCA 3480
CACACTTATG AATATTCCCT GTTTAAAAGA TTTTTATTTT ATACCAATGT TTCTCCGTAA 3540
ATGATGCAAA CATGATAGAG ATGTTAGCAT GTCTTTCTTA ACCTACTCAT GTTTTACATA 3600
TCACGACAAG CTTCTTGCAG AAAATCAGGA GTATATGGCA AATTGCTGCA ACCTGACAAC 3660
GTTTATATCT GTTTTCTAAC TCATACTGAC GGTGCAATTT CCTTTTAGTT TGGCCACCCA 3720
GAATGGATTG ACTTTCCAGA AGAAGGCAAC AACTGGAGTT ATGATAAATG CAGACGCCAG 3780
TGGAGCCTCG CAGACATTGA TCACCTACGA TACAAGGTTA TGCCTATGTA TATTTTTACA 3840
GTTTCTGGTC TGGTAGCTCT CTTGGGATCT TGACCTCACT TAGTTCCTTC ATCTCTGACT 3900
GTAGCTTATT TACACTGTGT TCCAACTTCT GTCTTGTGGA TAAATTCTCC CTTCTAACGT 3960
TTCATATTAA GCCTTTCAAA CTAAACTAAA TTGCTGATCT ACTACTAGTT GCTCAGTACG 4020
ATGACCAAAT CTTGCCTGTG GTAACCTAGT AATTTTCTTG ATTCTTACAC ATTAGTGATA 4080
TGCAGTGCAT ACATTATCCA TATAAATTGA CATTGCAATT TCCCAAATAT TATTTGAAGG 4140
CTGTGTTCTT TTGTTAACAG GAAGTTATTT TCTCTGCATC TGATAAATAA TAATAGCCTT 4200
TCACGATTTT TCTCATATTT TATCCAACTT TTCTGCATTC AAGCATTTTT TGTTTCTCGC 4260
CTAACATATA TAATTTGAAC AGTACATGAA CGCATTTGAT CAAGCAATGA ATGCGCTCGA 4320
CGACAAATTT TCCTTCCTAT CATCATCAAA GCAGATTGTC AGCGACATGA ATGAGGAAAA 4380
GAAGTAGTTA ACTATACAAT GTTTAGTCAG GGCAGCTGTT GCATCATTTG ATTCACTCCT 4440
ACTCTTAAGA ATAGCAACTC TGACTTGTGC GTTTTATGTT ACCAAATAAG TTGAAACCGT 4500
ATCTGTTTGA TATGAACCAT TGTTGTCTCA AAATGGGCTA TGGACTCAAT CCAACTTCCT 4560
TTCCAGATTA TTGTATTTGA ACGTGGAATC TGGTCTTCGT CTTCAATTTT CATCCCAGTA 4620
AAACTTATGA TGGGTAACTG ATCTCTTGCA AGCTTTGCCT TTCAATATTT CTTCTGCTTA 4680
ATGACTAATG TGCTTAATCT CGTTTCCACT TTTAAAACAC GCAGTTACAA AGTCGGATGT 4740
GACTTGCCTG GGAAGTACAA GGTAGCTCTG GACTCTGATG CTCTGATGTT TGGTGGACAT 4800
GGAAGAGTAA GCAATGTTAA TGATGTTCAA GATCTGTTTT GCAACACTAT GTTCTTCTAT 4860
AGAAGGGGCC ATCAAGGCTG CATCAGATAA TCTTATTTGG AGTGTTGATC TGTGCTGCAT 4920
CGCAGGTGGC CCATGACAAC GATCACTTTA CGTCACCTGA AGGAGTACCA GGAGTACCTG 4980
AAACAAACTT CAACAACCGC GCTAACTCAT TCAAAATCCT GTCTCCATCC CGCACTTGTG 5040
TGGTAATGCT AATTACTAGG AGGATTTAGT AACAATAAAT AAATAACAGC AAAAGATATC 5100
TGCAGTACGA TCTCACAAAA TGCTCTCTTG CCAGGCTTAC TATCGCGTCG AGGAGAAAGC 5160
GGAAAAGCCC AAGGATGAAG GAGCTGCTTT CTTGGGGGAA ACTGCTCTCG GGTACATCGA 5220
TGTTGAAGCC ACTGGCGTCA AAGACGCAGC AGATGGTGAG GCGACTTCTG GTTCCGAAAA 5280
GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC TTTGTCTTTC TGTCACCCGA 5340
CAAAGACAAC AAATAAGCAC CATATCAACG CTTGATCAGG ACCGTGTGCC GACGTCCTTG 5400
TAATACTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA CTGTGCAGAC TTGAAATTCT 5460
GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT AAATAAGCGG TGATGGTGCG 5520
GGTCGAGTCC AGCTATATGT GCCAAATATG CGCCATCCCG AGTCCTCTGT CATAAAGAAA 5580
GTTTCGGGCT TCCATCCCAG AATAAAAACA GTTGTCTGTT TGCAATTTCT TTTTGTCTTG 5640
CATAGTTACA TGATAATTGA TGCATATTGC TATAAGCCTG GATTGCATCT TCTTTTGCTA 5700
ATAACTGCAG GGCCAAGAAA GCCTAGATTG TATCTTTTTT TGCTAATAAC TGCAGTGCTG 5760
GGGAAGCTTC AGTCCTTGTT TCCGTTCTCG AGACAAGGCG TCATGTTTGG CGCACAAAGG 5820
TAAGCCATCA TCTTATCAAG TCCCAAAATT CTCTGGTTGA AAGAAACCAT CACTAACTTG 5880
TTCCAGGTGT TGGTTCCTCC ACAACCAAAA GGCGACCATC GTCGTCATCA TCGCTCACAG 5940
CACTGACCAT CGAAGCCACG GTGGGCATGA AATGCGCATC GCCCAAGACT TGGGACCGTT 6000
TCAAAATATC ACAAACTGCC ATGGCATCTT CTGCCAAAGG CTGCACTGCA CCTTTGGCAT 6060
GAACAGAAGC AACAGGGGCT TGGAACTGAA CGCCGAAAAT AAAGTCAAAC CGGCTGGGCC 6120
GGATTGAAAG GGGAAACGCC AAAATCCACT TAATTTGAAT GGAAGGAGGA ATGGTTCTTG 6180
CTGGTTTCAA CTCTGCAGGC TTCCCTCTGA ATTTCACACG GAGCCATT 6228

(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11463 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..11463

(D) OTHER INFORMATION: /product= "complete sequence of the starch branching enzyme II gene"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

AGAAACACCT CCATTTTAGA TTTTTTTTTT GTTCTTTTCG GACGGTGGGT CGTGGAGAGA 60
TTAGCGTCTA GTTTTCTTAA AAGAACAGGC CATTTAGGCC CTGCTTTACA AAAGGCTCAA 120
CCAGTCCAAA ACGTCTGCTA GGATCACCAG CTGCAAAGTT AAGCGCGAGA CCACCAAAAG 180
AGGCGCATTC GAACTGGACA GACGCTCACG CAGGAGCCCA GCACCACAGG CTTGAGCCTG 240
ACAGCGGACG TGAGTGCGTG ACACATGGGG TCATCTATGG GCGTCGGAGC AAGGAAGAGA 300
GACGCACATG AACACCATGA TGATGCTATC AGGCCTGATG GAGGGAGCAA CCATGCACCT 360
TTTCCCCTCT GGAAATTCAT AGCTCACACT TTTTTTTAAT GGAAGCAAGA GTTGGCAAAC 420
ACATGCATTT TCAAACAAGG AAAATTAATT CTCAAACCAC CATGACATGC AATTCTCAAA 480
CCATGCACCG ACGAGTCCAT GCGAGGTGGA AACGAAGAAC TGAAAATGAA CATCCCAGTT 540
GTCGAGTCGA GAAGAGGATG ACACTGAAAG TATGCGTATT ACGATTTCAT TTACATACAT 600
GTACAAATAC ATAATGTACC CTACAATTTG TTTTTTGGAG CAGAGTGGTG TGGTCTTTTT 660
TTTTTACACG AAAATGCCAT AGCTGGCCCG CATGCGTGCA GATCGGATGA TCGGTCGGAG 720
ACGACGGACA ATCAGACACT CACCAACTGC TTTTGTCTGG GACACAATAA ATGTTTTTGT 780
AAACAAAATA AATACTTATA AACGAGGGTA CTAGAGGCCG CTAACGGCAT GGCCAGGTAA 840
ACGCGCTCCC AGCCGTTGGT TTGCGATCTC GTCCTCCCGC ACGCAGCGTC GCCTCCACCG 900
TCCGTCCGTC GCTGCCACCT CTGCTGTGCG CGCGCACGAA GGGAGGAAGA ACGAACGCCG 960
CACACACACT CACACACGGC ACACTCCCCG TGGGTCCCCT TTCCGGCTTG GCGTCTATCT 1020
CCTCTCCCCC GCCCATCCCC ATGCACTGCA CCGTACCCGC CAGCTTCCAC CCCCGCCGCA 1080
CACGTTGCTC CCCCTTCTCA TCGCTTCTCA ATTAATATCT CCATCACTCG GGTTCCGCGC 1140
TGCATTTCGG CCGGCGGGTT GAGTGAGATC TGGGCGACTG GCTGACTCAA TCACTACGCG 1200
GGGATGGCGA CGTTCGCGGT GTCCGGCGCG ACTCTCGGTG TGGCGCGGGC CGGCGTCGGA 1260
GTGGCGCGGG CCGGCTCGGA GCGGAGGGGC GGGGCGGACT TGCCGTCGCT GCTCCTCAGG 1320
AAGAAGGACT CCTCTCGTAC GCCTCGCTCT CTCGAATCTC CCCCGTCTGG CTTTGGCTCC 1380
CCTTCTCTCT CCTCTGCGCG CGCATGGCCT GTTCGATGCT GTTCCCCAAT TGATCTCCAT 1440
GAGTGAGAGA GATAGCTGGA TTAGGCGATC GCGCTTCCTG AACCTGTATT TTTTCCCCCG 1500
CGGGGAAATG CGTTAGTGTC ACCCAGGCCC TGGTGTTACC ACGGCTTTGA TCATTCCTCG 1560
TTTCATTCTG ATATATATTT TCTCATTCTT TTTCTTCCTG TTCTTGCTGT AACTGCAAGT 1620
TGTGGCGTTT TTTCACTATT GTAGTCATCC TTGCATTTTG CAGGCGCCGT CCTGAGCCGC 1680
GCGGCCTCTC CAGGGAAGGT CCTGGTGCCT GACGGCGAGA GGACGACTTG GCAAGTCCGG 1740
CGCAACCTGA AGAATTACAG GTACACACAC TCGTGCCGGT AAATCTTCAT ACAATCGTTA 1800
TTCACTTACC AAATGCCGGA TGAAACCAAC CACGGATGCG TCAGGTTTCG AGCTTCTTCT 1860
ATCAGCATTG TGCAGTACTG CACTGCCTTG TTCATTTTGT TAGCCTTGGC CCCGTGCTGG 1920
CTCTTGGGCC ACTGAAAAAA TCAGATGGAT GTGCATTCTA GCAAGAACTT CACAACATAA 1980
TGCACCGTTT GGGGTTTCGT CAGTCTGCTC TACAATTGCT ATTTTTCGTG CTGTAGATAC 2040
CTGAAGATAT CGAGGAGCAA ACGGCGGAAG TGAACATGAC AGGGGGGACT GCAGAGAAAC 2100
TTCAATCTTC AGAACCGACT CAGGGCATTG TGGAAACAAT CACTGATGGT GTAACCAAAG 2160
GAGTTAAGGA ACTAGTCGTG GGGGAGAAAC CGCGAGTTGT CCCAAAACCA GGAGATGGGC 2220
AGAAAATATA CGAGATTGAC CCAACACTGA AAGATTTTCG GAGCCATCTT GACTACCGGT 2280
AATGCCTACC CGCTGCTTTC GCTCATTTTG AATTAAGGTC CTTTCATCAT GCAAATTTGG 2340
GGAACATCAA AGAGACAAAG ACTAGGGACC ACCATTTCAT ACAGATCCCT TCGTGGTCTG 2400
AGAATATGCT GGGAAGTAAA TGTATAATTG ATGGCTACAA TTTGCTCAAA ATTGCAATAC 2460
GAATAACTGT CTCCGATCAT TACAATTAAA GAGTGGCAAA CTGATGAAAA TGTGGTGGAT 2520
GGGTTATAGA TTTTACTTTG CTAATTCCTC TACCAAATTC CTAGGGGGGA AATCTACCAG 2580
TTGGGAAACT TAGTTTCTTA TCTTTGTGGC CTTTTTGTTT TGGGGAAAAC ACATTGCTAA 2640
ATTCGAATGA TTTTGGGTAT ACCTCGGTGG ATTCAACAGA TACAGCGAAT ACAAGAGAAT 2700
TCGTGCTGCT ATTGACCAAC ATGAAGGTGG ATTGGAAGCA TTTTCTCGTG GTTATGAAAA 2760
GCTTGGATTT ACCCGCAGGT AAATTTAAAG CTTTATTATT ATGAAACGCC TCCACTAGTC 2820
TAATTGCATA TCTTATAAGA AAATTTATAA TTCCTGTTTT CCCCTCTCTT TTTTCCAGTG 2880
CTGAAGGTAT CGTCTAATTG CATATCTTAT AAGAAAATTT ATATTCCTGT TTTCCCCTAT 2940
TTTCCAGTGC TGAAGGTATC ACTTACCGAG AATGGGCTCC CTGGAGCGCA TGTTATGTTC 3000
TTTTAAGTTC CTTAACGAGA CACCTTCCAA TTTATTGTTA ATGGTCACTA TTCACCAACT 3060
AGCTTACTGG ACTTACAAAT TAGCTTACTG AATACTGACC AGTTACTATA AATTTATGAT 3120
CTGGCTTTTG CACCCTGTTA CAGTCTGCAG CATTAGTAGG TGACTTCAAC AATTGGAATC 3180
CAAATGCAGA TACTATGACC AGAGTATGTC TACAGCTTGG CAATTTTCCA CCTTTGCTTC 3240
ATAACTACTG ATACATCTAT TTGTATTTAT TTAGCTGTTT GCACATTCCT TAAAGTTGAG 3300
CCTCAACTAC ATCATATCAA AATGGTATAA TTTGTCAGTG TCTTAAGCTT CAGCCCAAAG 3360
ATTCTACTGA ATTTAGTCCA TCTTTTTGAG ATTGAAAATG AGTATATTAA GGATGAATGA 3420
ATACGTGCAA CACTCCCATC TGCATTATGT GTGCTTTTCC ATCTACAATG AGGATATTTC 3480
CATGCTATCA GTGAAGGTTT GCTCCTATTG ATGCAGATAT TTGATATGGT CTTTTCAGGA 3540
TGATTATGGT GTTTGGGAGA TTTTCCTCCC TAACAACGCT GATGGATCCT CAGCTATTCC 3600
TCATGGCTCA CGTGTAAAGG TAAGCTGGCC AATTATTTAG TCGAGGATGT AGCATTTTCG 3660
AACTCTGCCT ACTAAGGGTC CCTTTTCCTC TCTGTTTTTT AGATACGGAT GGATACTCCA 3720
TCCGGTGTGA AGGATTCAAT TTCTGCTTGG ATCAAGTTCT CTGTGCAGGC TCCAGGTGAA 3780
ATACCTTTCA ATGGCATATA TTATGATCCA CCTGAAGAGG TAAGTATCGA TCTACATTAC 3840
ATTATTAAAT GAAATTTCCA GTGTTACAGT TTTTTAATAC CCACTTCTTA CTGACATGTG 3900
AGTCAAGACA ATACTTTTGA ATTTGGAAGT GACATATGCA TTAATTCACC TTCTAAGGGC 3960
TAAGGGGCAA CCAACCTTGG TGATGTGTGT ATGCTTGTGT GTGACATAAG ATCTTATAGC 4020
TCTTTTATGT GTTCTCTGTT GGTTAGGATA TTCCATTTTG GCCTTTTGTG ACCATTTACT 4080
AAGGATATTT ACATGCAAAT GCAGGAGAAG TATGTCTTCC AACATCTCAA CTAAACGACC 4140
AGAGTCACTA AGGATTTATG AATCACACAT TGGAATGAGC AGCCCGGTAT GTCAATAAGT 4200
TATTTCACCT GTTTCTGGTC TGATGGTTTA TTCTATGGAT TTTCTAGTTC TGTTATGTAC 4260
TGTTAACATA TTACATGGTG CATTCACTTG ACAACCTCGA TTTTATTTTC TAATGTCTTC 4320
ATATTGGGAA GTGCAAAACT TTGCTTCCTC TTTGTCTGCT TGTTCTTTTG TCTTCTGTAA 4380
GATTTCCATT GCATTTGGAG GCAGTGGGCA TGTGAAAGTC ATATCTATTT TTTTTTTGTC 4440
AGAGCATAGT TATATGAATT GGATTGTTGT TGCAATAGCT CGGTATAATG TAACCATGTT 4500
ACTAGCTTAA GATTTCCCAC TTAGGATGTA AGAAATATTG CATTGGAGCG TCTCCAGCAA 4560
GCCATTTCCT ACCTTATTAA TGAGAGAGAG ACAAGGGGGG GGGGGGGGGG GGGGTTCCCT 4620
TCATTATTCT GCGAGCGATT CAAAAACTTC CATTGTTCTG AGGTGTACGT ACTGCAGGGA 4680
TCTCCCATTA TGAAGAGGAT ATAGTTAATT CTTTGTAACC TACTTGGAAA CTTGAGTCTT 4740
GAGGCATCGC TAATATATAC TATCATCACA ATACTTAGAG GATGCATCTG AAATTTTAGT 4800
GTGATCTTGC ACAGGAACCG AAGATAAATT CATATGCTAA TTTTAGGGAT GAGGTGTTGC 4860
CAAGAATTAA AAGGCTTGGA TACAATGCAG TGCAGATAAT GGCAATCCAG GAGCATTCAT 4920
ACTATGCAAG CTTTGGGTAT TCACACAATC
TTGGAGCTAT TACATCCTAA TGCTTCATGC 
5 AGATATATAG TACAACTACA CTTAGTATTC 
GTTCCAGGTA CCATGTTACT AATTTTTTTG 
ACTTAAAATC CTTGATCGAT AGAGCACATG 

10 

TTCATAGGTA ATTAGTCCAA TTTAATTTTA 
GGGAAATTCA GGCAATTATG ATACATTGTC 
15 AAAATCTAGA GTGGCATAAG GAAAATTGGC 
CCATCCTAAA TGGCAGGGCC CTATCGCCGA 
GTGACTTCTT TTTTCTCAGA TGTATTAAAC 

20 

TAGTAAACTG ACAGTTCCAT AGAATATGGT 
GATGTGGATT GAGAAGTTCA GATGCTATCA 

2 5 GGCACTACAT ATAGTTTGCA AGTTGGAAAA 

AGGCCCCACT TGCCAGCTTC ATACTAGATG 
TTACTTAAAG TTCTTCATTT GTCCTAAGTC 

30 

G AAAAT AT AT CAACATCTAC AACACCAAAT 
ATTATATTAG CACATCTTTG ATGTTGTAGA 

3 5 AATATAGAGA AGTTTGACTT AGGACAAATC 

ACATCAAATA AT AT AG AT AG ATGTCAACAC 
ATATGCATCA GACCATCTGT TTGCTTTAGC 

40 

TAATCTACTT TTCCTTCTAC TTGGTTTGGT 
ATGATTTTGT GTACCCTGCA GTCATTCGTC 

4 5 CGATGGCACT GATACACATT ACTTCCACGG 

TTCTCGTCTA TTCAACTATG GGAGTTGGGA 
TTGGCTAACT GTTCCTGTTA ATCTGTTCTT 

50 

TATTGAGATT CTTACTGTCA AACGCGAGAT 
TTCGATTTGA TGGGGTGACC TCCATGATGT 
55 AAGTGGTTTC AGTAACTTTT TTAGGGCACT 
CATGATCAGG ACTTGTGCTA CGGAGTCTTA 
ACCTGATGAG ATCATGGAAG ATTGGAAGTG 

60 

TCTTGTTCTA GATGACATTT ACTGGGAACT 
TTGATGCGGT AGTTTACTTG ATGCTGGTCA 
65 CTGTATCCAT TGGTGAAGAT GTAAGTGCTT 
AGTTTTATTT TGGGGATCAG TCTGTTACAC 



PCT/AU98/00743 

78 - 

CATTTTTTTC TGTATACACT CTTCACCCAT 4 9 80 
AC AT AAAAT A TTTGGATATA ATCCTTTATT 504 0 
TGAAAAAGAT CATTTTATTG TTGTTGGCTT 5100 
CACCAAGTAG CCGTTTTGGA ACTCCAGAGG 5160 
AGCTTGGTTT GCTTGTTCTT ATGGATATTG 52 2 0 
GCTGTTTTAC TGTTTATCTG GTATTCTAAA 5 2 80 
AAAAGCTAAG AGTGGCGAAA GTGAAATGTC 5 3 40 
AAAAACTAGA GTGGCAAAAA TAAAATTTTC 5400 
ATATTTTTCC ATTCTATATA ATTGTGCTAC 54 60 
CAGTTGGACA TGAAATGTAT TTGGTACATG 552 0 
TTTGTAATGG CAACACAATT TGATGCCATA 55 80 
ATAGAATTAA TCAACTGGCC ATGTACTCGT 5640 
CTGACAGCAA TACCTCACTG ATAAGTGGCC 57 00 
TTACTTCCCT GTTGAATTCA TTTGAACATA 57 6 0 
AAACTTCTTT AAGTTTGACC AAGTCTATTG 5820 
TACTTTGATC AGATTAACAA TTTTTATTTT 5880 
TATCAGCACA TTTTTCTATA GACTTGGTCA 59 4 0 
TAGAACTTCA ATCAATTTGG ATCAGAGGGA 6000 
TTCAACAAAA AAATCAGACC TTGTCACCAT 6060 
CACTTGCTTT CATATTTATG TGTTTGTACC 612 0 
TGATTCTATT TCAGTTGCAT TGCTTCATCA 618 0 
AAATAATACC CTTGACGGTT TGAATGGTTT 62 4 0 
TGGTCCACGC GGCCATCATT GGATGTGGGA 63 00 
AGTATGTAGC TCTGACTTCT GTCACCATAT 63 60 
ACACATGTTG ATATTCTATT CTTATGCAGG 642 0 
GGTGGCTTGA AGAATATAAG TTTGATGGAT 6 4 80 
ATACTCACCA TGGATTACAA GTAAGTCATC 6 54 0 
GAAACAATTG CTATGCATCA TAACATGTAT 660 0 
GATAGTTCCC TAGTATGCTT GTACAATTTT 6 66 0 
ATTATTATTT ATTTTCTTTC TAAGTTTGTT 67 2 0 
ATGGCGAATA TTTTGGATTT GCTACTGATG 67 8 0 
ACGATCTAAT TCATGGACTT TATCCTGATG 6 84 0 
ACAGTATTTA TGATTTTTAA CTAGTTAAGT 69 0 0 
TTTTTGTTAG GGGTAAAATC TCTCTTTTCA 69 6 0 
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TAACAATGCT AATTTATACC TTGTATGATA 



GGGCATTCAA GCTTACGAGC ATATTTTTTG 

5 

TTTGGGTTTT TCAATAAGTG GGAGTGTGTG 



AAGAAATGGG CAACCTTGTC AATTGCTTCA 



10 GGAAATGAGA GGCTATTCCC AAGG AC ATG A 



TTGTAATAGT GGTTTAACTT AAATTCCTGC 



TGTACACATC CACAAACAAG TAATCCTGAG 

15 

TCTGCCAGCA TTAACTGTTC ACAGTTCTAA 



TGGAATGCCT ACATTTTGCA TCCCTGTTCC 



2 0 GCATATGGCT GTAGCAGATA AATGGATTGA 



ATTACATGCG CACAATGATC TAGATTACAT 



GTGAATATCT AGACATTTGC CTGTTATCAG 

25 

AAATAGCAAA TCTCGGAAAT GTAATGGCTA 



CTGTAGCAGG CCAGTCAACA CAGTTAGCAA 



3 0 TATATGAGAA AGTTAGTATA TAAACTGTGG 



TAAGGATGGG CAGTAGGTAA TAAATTTAGC 



AAAGGAATAT ACAGGGTCAT GTAGCATATC 

35 

GGCTCGGTAA AAAAAACTTT ATGATGATCC 



CAAATACTTA TTGCTACTAC ACAGCTGCCA 



40 TATTTAGATT TAAATACTAA CTCGATACAT 



TTTGGTGGAT ACCAGAATTT CTGCCCTCTT 



CTCTGCCGTT ACAAAAGCTG TTTTCAGTTT 

45 

TTAAGCATGT TTTTTGAAGC TGTGAGCTGT 



AATATGCTGC AGTGTAATTT AGCATTTCTT 



50 ATGGGCGATA TTGTGCACAC CCTAACAAAT 



GCAGAAAGTC ATGATCAAGC ACTAGTTGGT 



AAGGTACTAG CTGTTACTTT TGGACAAAAG 

55 

CTTTGTAGAG ATTCCACTAT GGACCACATA 



ACTCATTTTG CTTCGTATGT AGTCCATAGT 



60 CGGAGGGAGT ACATAATTGA TTTGTCTCAT 



TTGGCTGCCT CACCCATCAC CAGCTATTTC 



AACGTACCAT GTGGTACTGT GGCGGCTTGT 

65 

TTCTTATTTA TTTGATTGCT TATGTTACCG 
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ATGCATCACT TAGTAATTTG AAAAGTGCAA 7 02 0 
ATGGCTGTAA TTTATTTGAT AGTATGCTTG 7 0 80 
ACTAATGTTG TATTATTTAT TTAATTGCGG 714 0 
GAAGGCTAAC TTTGATTCCA TAAACGCTTT 7 2 00 
ATTATACTTC AGTGTGTTCT GTACATGTAT 72 60 
ACTGCTATGG AATCTCACTG TATGTTGTAG 73 20 
CTTTCAACTC ATGAGAAAAT AGAGTCCGCT 73 30 
TTTGTGTAAC TGTGAAATTG TTCAGGTCAG 7 4 40 
AGATGGTGGT GTTGGTTTTG ACTACCGCCT 7 5 00 
ACTCCTCAAG TAAGTGCAGG AATATTGGTG 75 60 
TTTCTAAATG GTAAAAAGGA AAATATGTAT 7 62 0 
CTTGAATACG AGAAGTCAAA TACATGATTT 7 6 80 
GTGTCTTTAT GCTGGGCAGT GTACATTGCG 77 40 
TATTTTCAGA AACAATATTA TTTATATCCG 7 8 00 
TCATTAATTG TGTTCACCTT TTGTCCTGTT 78 60 
CAGATAAAAT AAATCG TTAT TAGGTTTACA 79 20 
TAGTTGTAAT TAATGAAAAG GCTGACAAAA 79 80 
AGATAGATAT GCAGGAACGC GACTAAAGCT 8 040 
ATCTGTCATG ATCTGTGTTC TGCTTTGTGC 8100 
TGGCAATAAT AAACTTAACT ATTCAACCAA 8160 
GTTAGTAATG ATGTGCTCCC TGCTGCTGTT 82 2 0 
TTTGCATCAT TATTTTTGTG TGTGAGTAGT 82 8 0 
TGGTACTTAA TACATTCTTG GAAGTGTCCA 8 3 40 
TAACACAGGC AAAGTGACGA ATCTTGGAAA 8 400 
AGAAGGTGGC TTGAGAAGTG TGTAACTTAT 84 6 0 
GACAAGACTA TTGCATTCTG GTTGATGGAT 8 52 0 
AATTACTCCC TCCCGTTCCT AAATATAAGT 8 580 
GTATATAGAT GCATTTTAGA GTGTAGATTC 864 0 
GAAATCTCTA CAGAGACTTA TATTTAGGAA 87 00 
CAGATTGCTA GTGTTTTCTT GTGATAAAGA 8 76 0 
CCAACTGTTA CTTGAGCAGA ATTTGCTGAA 8 82 0 
GAACTTTGAC AGTTATGTTG CAATTTTCTG 8 8 80 
TTCATTTGCT CATTCCTTTC CGAGACCAGC 894 0 
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CAAAGTCACG TGTTAGCTGT GTGATCTGTT ATCTGAATCT TGAGCAAATT TTATTAATAG 9000 
GCTAAAATCC AACGAATTAT TTGCTTGAAT TTAAATATAC AGACGTATAG TCACCTGGCT 9060 
5 CTTTCTTAGA TGATTACCAT AGTGCCTGAA GGCTG AAATA GTTTTGGTGT TTCTTGGATG 912 0 
CCGCCTAAAG GAGTGATTTT TATTGGATAG ATTCCTGGCC GAGTCTTCGT TACAACATAA 9180 
CATTTTGGAG ATATGCTTAG TAACAGCTCT GGGAAGTTTG GTCACAAGTC TGCATCTACA 9 2 40 

10 

CGCTCGTTGA GGTTTTATTA TGGCGCCATC TTTGTAACTA GTGGCACCTG TAAGGAAACA 9 3 00 
CATTCAAAAG GAAAGGGTCA CATCATTCTA ATCAGGACCA CCATACTAAG AGGAAGATTC 93 6 0 
15 TGTTCCAATT TTATGAGTTT TTGGGACTCC AAAGGGAACA AAAGTGTCTC ATATTGTGCT 9 42 0 
TATAAGTACA GTTGTTTTTA TACCAGTGTA GTTTTATTCC AGGACAGTTG ATAGTTGGTA 9480 
CTGTGCTGTA AATTATTTAT CCGAC ATAGA ACAGCATGAA CATATCAAGC TCTCTTTGTG 9 54 0 

20 

GAGGATATGT ATGATTTCAT GGCTCTGGAT AGGCTTCAAC TGTTCGCATT GATGGTGGGA 9 6 00 
TAGCATTACA TAAAATGATC AGGCTTGTCA CCATGGGTTT AGGTGGTGAA GGCTATCTTA 9 66 0 

2 5 ACTTGATGGG AAATGAGTTT GGGGATCCTG GTCAGTGTTT ACAACATTAT TGCATTCTGG 9720 

ATGATTGTGA TTTAGTGTAA TTTGAACGAT GCTTTTCTTT CACATTGTAT GTATTATGTA 97 8 0 
ATCTGTTGCT TCCAAGGAGG AAGTTAACTT CTATTTAGTT GGGAGAATGG ATAGATTTTC 9 84 0 

30 

CAAGAGGCCC ACAAACTCTT CCAACCGGCA AAGTTCTGCC CTGGAAATAA GAATAGTTAT 990 0 
GATAAATGCG GCCGTAGATT TGATCTTGTA AGTTTTAGCT GTGGTATTAC ATTCCCTCAC 99 6 0 

3 5 TAGATCTTTA TTGGCCATTT ATTTGTTGAT GAAATGATAA TGTTTGTTAG GAAAGATCAA 10020 

CATTGCTTTT GTAGTTTTGT AGACGTTAAC ATAAGTATGT GTTGAGAGTT GTTGATCATT 100 80 
AAAAATATCA TGATTTTTTG CAGGGAGATG CAGATTTTCT TAGATATCGT GGTATGCAAG 10140 

40 

AGTTCGATCA GGCAATGCAG CATCTTGAGG AAAAATATGG GGTATGTCAC TGGTTTGTCT 10200 
TTGTTGCATA ACAAGTCACA GTTTAACGTC AGTCTCTTCA AGTGGTAAAA AAAGTGTAGA 10 260 

4 5 ATTAATTCCT GTAATGAGAT GAAAACTGTG CAAAGGCGGA GCTGGAATTG CTTTTCACCA 10 3 20 

AAACTATTTT CTTAAGTGCT TGTGTATTGA TACATATACC AGCACTGACA ATGTAACTGC 10 3 80 
AGTTTATGAC ATCTGAGCAC CAGTATGTTT CACGGAAACA TGAGGAAGAT AAGGTGATCA 10440 

50 

TCCTCAAAAG AGGAGATTTG GTATTTGTTT TCAACTTCCA CTGGAGCAAT AGCTTTTTTG 10 500 
ACTACCGTGT TGGGTGTTCC AAGCCTGGGA AGT AC AAGGT ATGCTTGCCT TTTCATTGTC 10560 
55 CACCCTTCAC CAGTAGGGTT AGTGGGGGCT TCTACAACTT TTAATTCCAC ATGGATAGAG 10 620 
TTTGTTGGTC GTGCAGCTAT CAATATAAAG AATAGGGTAA TTTGTAAAGA AAAGAATTTG 10680 
CTCGAGCTGT TGTAGCCATA GGAAGGTTGT TCTTAACAGC CCCGAAGCAC ATACCATTCA 10740 

60 

TTCATATTAT CTACTTAAGT GTTTGTTTCA ATCTTTATGC TCAGTTGGAC TCGGTCTAAT 10800 
ACTAGAACTA TTTTCCGAAT CTACCCTAAC CATCCTAGCA GTTTTAGAGG AGCCCCATTT 10860 
6 5 GGACAATTGG CTGGGTTTTT GTTAGTTGTG AGAGTTTCTG CTATTTCTTA ATCAGGTGGC 10920 
CTTGGACTCT GAGGATGCAC TCTTTGGTGG ATTCAGCAGG CTTGATGATG ATGTCGACTA 109 80 
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CTTCACAACC GTAAGTCTGG GCTCAAGCGT CACTTGACTC GTCTTGACTC AACTGCTTAC 11040 

AAATCTGAAT CAACTTCCCA ATTGCTGATG CCCTTGCAGG AACATCCGCA TGACAACAGG 11100 

5 CCGCGCTCTT TCTCGGTGTA CACTCCGAGC AGAACTGCGG TCGTGTATGC CC TT AC AG AG 11160 

TAAGAACCAG CAGCGGCTTG TTACAAGGCA AAGAGAGAAC TCCAGAGAGC TCGTGGATCG 112 2 0 

TGAGCGAAGC GACGGGCAAC GGCGCGAGGC TGCTCCAAGC GCCATGACTG GGAGGGGATC 112 80 

10 

GTGCCTCTTC CCCAGATGCC AGGAGGAGCA GATGGATAGG TAGCTTGTTG GTGAGCGCTC 113 40 

GAAAGAAAAT GGACGGGCCT GGGTGTTTGT TGTGCTGCAC TGAACCCTCC TCCTATCTTG 11400 

15 CACATTCCCG GTTGTTTTTG TACATATAAC TAATAATTGC CCGTGCGCTC AACGTGAAAA 11460 
TCC 1146 3 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2662 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..2651
(D) OTHER INFORMATION:/product= "nucleotide sequence of cDNA wheat SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

TCTCCCACTC TTCTCTCCCC GCGCACACCG AGTCGGCACC GGCTCATCAC CCATCACCTC 60
GGCCTCGGCC ACCGGCAAAC CCCCCGATCC GCTTTTGCAG GCAGCGCACT AAAACCCCGG 120
GGAGCGCGCC CCGCGGCAGC AGCAGCACCG CAGTGGGAGA GAGAGGCTTC GCCCCGGCCC 180
GCACCGAGCG GGGCGATCCA CCGTCCGTGC GTCCGCACCT CCTCCGCCTC CTCCCCTGTC 240
CCGCGCGCCC ACACCCATGG CGGCGACGGG CGTCGGCGCC GGGTGCCTCG CCCCCAGCGT 300
CCGCCTGCGC GCCGATCCGG CGACGGCGGC CCGGGCGTCC GCCTGCGTCG TCCGCGCGCG 360
GCTCCGGCGC TTGGCGCGGG GCCGCTACGT TGCCGAGCTC AGCAGGGAGG GCCCCGCGGC 420
GCGCCCCGCG CAGCAGCAGC AACTGGCCCC GCCGCTCGTG CCAGGCTTCC TCGCGCCGCG 480
GCCGCCCGCG CCCGCCCAGT CGCCGGCCCC GACGCAGCCG CCCCTGCCGG ACGCCGGCGT 540
GGGGGAACTC GCGCCCGACC TCCTGCTCGA AGGGATTGCT GAGGATTCCA TCGACAGCAT 600
AATTGTGGCT GCAAGTGAGC AGGATTCTGA GATCATGGAT GCGAATGAGC AACCTCAAGC 660
TAAAGTTACA CGTAGCATCG TGTTTGTGAC TGGTGAAGCT GCTCCTTATG CAAAGTCAGG 720
GGGGCTGGGA GATGTTTGTG GTTCGTTACC AATTGCTGTT GCTGCTCGTG GTCACCGTGT 780
GATGGTTGTA ATGCCAAGAT AGTTGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATT 840
ATACACTGGG AAGCACATTA AGATTCCATG CTTTGGGGGA TCAGATGAAG TGACCTTTTT 900
TCATGAGTAT AGAGACAACG TCGATTGGGT GTTTGTCGAT CATCCGTGAT ATCATAGACC 960
AGGAAGTTTA TATGGAGATA ATTTTGGTGC TTTTGGTGAT AATCAGTTCA GATACACACT 1020
CCTTTGCTAT GCTGCATGCG AGGCCCCAGT AATCGTTGAA TTGGGAGGAT ATATTTATGG 1080
ACAGAATTGG ATGTTTGTTG TGAAGGATTG GCATGCCAGC CTTGTGCCAG TGCTTGTTGC 1140
TGCAAAATAT AGACCATACG GTGTTTACAG AGATTCCCGC AGCACCCTTG TTATACATAA 1200
TTTAGCACAT GAGGGTCTGG AGCCTGCAAG TAGATATCCT GATCTGGGAT TGCCACGTGA 1260
ATGGTATGGA GGTTTAGAAT GGGTATTTCC AGAATGGGCA AGGAGGCATG CCCTTGACAA 1320
GGGTGAGGCA GTTAACTTTT TGAAAGGAGC AGTCGTGAGA GCAGATCGAA TTGTGACCGT 1380
CAGTCAGGGT TATTCATGGG AGGTGACAAC TGCTGAAGGT GGACAGGGCC TCAATGAGCT 1440
GTTAAGCTGC GGAAAAAGTG TATTGAATGG AATTGTAAAT GGAATTGACA TTAATGATTG 1500
GAACCCCACC ACAGACAAGT GTCTCCCTCA TCATTATTCT GTCGATGACC TCTCTGGAAA 1560
GGCCAAATGT AAAGCTGAAT TGCAGAAGGA GCTGGGTTTA CCTGTAAGGG AGGATGTTGC 1620
TCTGATTGGC TTTATTGGAA GAGTGGATTA CCAGAAAGGC ATTGATCTCA TTAAAATGGC 1680
CATTCCAGAG CTCATGAGGG AGGACGTGCA GTTTGTCATG CTTGGATCTG GGGATCCAAT 1740
TTTTGAAGGC TGGATGAGAT GTACCGAGTC GAGTTACAAG GATAAATTCC GTGGATGGGT 1800
TGGATTTAGT GTTGCAGTTT GCCACAGAAT AAGTGGAGGT TGCGATATAT TGTTAATGCC 1860
ATCCAGGTTT GAACCTTGTG GTCTTAATCA GCTATATGCT ATGCAATATG GTACAGTTCC 1920
TGTAGTTCAT GGAACTGGGG GCCTCCGAGA CACAGTCGAG ACCTTGAACC CTTTTGGTGC 1980
AAAAGGAGAG GAGGGTACAG GGTGGGCGTT CTCACCGCTA ACCGTGGACA AGATGTTGTG 2040
GGCATTGCGA ACCGCGATGT CGACATTCAG GGAGCACAAG CCGTCCTGGG AGGGGCTCAT 2100
GAAGCGAGGC ATGACGAAAG ACCATACGTG GGACCATGCC GCCGAGCAGT ACGAGCAGAT 2160
CTTCGAATGG GCCTTCGTGG ACCAACCCTA CGTCATGTAG ACGGGGACTG GGGAGGTCGA 2220
AGCGCGGGTC TCCTTGAGCT CTGAAGAGAT GTTCCTCATC CTTCCGCGGC CCGGAAGGAT 2280
ACCCCTGTAC ATTGCGTTGT CCTGCTACAG TAGAGTCGCA ATGCGCCTGC TTGCTTGGTC 2340
CGCCGGTTCG AGAGTAGATG ACGGCTGTGC TGCTGCGGCG GTGACAGCTT CGGGTGGATG 2400
ACAGTTACAG TTTTGGGGAA TAAGGAAGGG ATGTGCTGCA GGATGGTTAA CAGCAAAGCA 2460
GCACTCAGAT GGCAGCCTCT CTGTCCGTGT TACAGCTGAA ATCAGAAACC AACTGGTGAC 2520
TCTTTAGCCT TAGCGATTGT GAAGTTTGTT GCATTCTGTG TATGTTGTCT TGTCCTTAGC 2580
TGACAAATAT TAGACCTGTT GGAGAATTTT ATTTATCTTT GCTGCTGTTG TTTTTGTTTT 2640
GTTAAAAAAA AAAAAAAAAA AA 2662

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..768
(D) OTHER INFORMATION:/product= "deduced amino acid sequence SBE II"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Ala Thr Phe Ala Val Ser Gly Ala Thr Leu Gly Val Ala Arg Pro
1               5                   10                  15

Pro Ala Ala Ala Gln Pro Glu Glu Leu Gln Ile Pro Glu Asp Ile Glu
            20                  25                  30

Glu Gln Thr Ala Glu Val Asn Met Thr Gly Gly Thr Ala Glu Lys Leu
        35                  40                  45

Glu Ser Ser Glu Pro Thr Gln Gly Ile Val Glu Thr Ile Thr Asp Gly
    50                  55                  60

Val Thr Lys Gly Val Lys Glu Leu Val Val Gly Glu Lys Pro Arg Val
65                  70                  75                  80

Val Pro Lys Pro Gly Asp Gly Gln Lys Ile Tyr Glu Ile Asp Pro Thr
                85                  90                  95

Leu Lys Asp Phe Arg Ser His Leu Asp Tyr Arg Tyr Ser Glu Tyr Arg
            100                 105                 110

Arg Ile Arg Ala Ala Ile Asp Gln His Glu Gly Gly Leu Glu Ala Phe
        115                 120                 125

Ser Arg Gly Tyr Glu Lys Leu Gly Phe Thr Arg Ser Ala Glu Gly Ile
    130                 135                 140

Thr Tyr Arg Glu Trp Ala Pro Gly Ala His Ser Ala Ala Leu Val Gly
145                 150                 155                 160

Asp Phe Asn Asn Trp Asn Pro Asn Ala Asp Thr Met Thr Arg Asp Asp
                165                 170                 175

Tyr Gly Val Trp Glu Ile Phe Leu Pro Asn Asn Ala Asp Gly Ser Pro
            180                 185                 190

Ala Ile Pro His Gly Ser Arg Val Lys Ile Arg Met Asp Thr Pro Ser
        195                 200                 205

Gly Val Lys Asp Ser Ile Ser Ala Trp Ile Lys Phe Ser Val Gln Ala
    210                 215                 220

Pro Gly Glu Ile Pro Phe Asn Gly Ile Tyr Tyr Asp Pro Pro Glu Glu
225                 230                 235                 240

Glu Lys Tyr Val Phe Gln His Pro Gln Pro Lys Arg Pro Glu Ser Leu
                245                 250                 255

Arg Ile Tyr Glu Ser His Ile Gly Met Ser Ser Pro Glu Pro Lys Ile
            260                 265                 270

Asn Ser Tyr Ala Asn Phe Arg Asp Glu Val Leu Pro Arg Ile Lys Arg
        275                 280                 285

Leu Gly Tyr Asn Ala Val Gln Ile Met Ala Ile Gln Glu His Ser Tyr
    290                 295                 300

Tyr Ala Ser Phe Gly Tyr His Val Thr Asn Phe Phe Ala Pro Ser Ser
305                 310                 315                 320

Arg Phe Gly Thr Pro Glu Asp Leu Lys Ser Leu Ile Asp Arg Ala His
                325                 330                 335

Glu Leu Gly Leu Leu Val Leu Met Asp Ile Val His Ser His Ser Ser
            340                 345                 350

Asn Asn Thr Leu Asp Gly Leu Asn Gly Phe Asp Gly Thr Asp Thr His
        355                 360                 365

Tyr Phe His Gly Gly Pro Arg Gly His His Trp Met Trp Asp Ser Arg
    370                 375                 380

Leu Phe Asn Tyr Gly Ser Trp Glu Val Leu Arg Phe Leu Leu Ser Asn
385                 390                 395                 400

Ala Arg Trp Trp Leu Glu Glu Tyr Lys Phe Asp Gly Phe Arg Phe Asp
                405                 410                 415

Gly Val Thr Ser Met Met Tyr Thr His His Gly Leu Gln Met Thr Phe
            420                 425                 430

Thr Gly Asn Tyr Gly Glu Tyr Phe Gly Phe Ala Thr Asp Val Asp Ala
        435                 440                 445

Val Val Tyr Leu Met Leu Val Asn Asp Leu Ile His Gly Leu His Pro
    450                 455                 460

Asp Ala Val Ser Ile Gly Glu Asp Val Ser Gly Met Pro Thr Phe Cys
465                 470                 475                 480

Ile Pro Val Pro Asp Gly Gly Val Gly Phe Asp Tyr Arg Leu His Met
                485                 490                 495

Ala Val Ala Asp Lys Trp Ile Glu Leu Leu Lys Gln Ser Asp Glu Ser
            500                 505                 510

Trp Lys Met Gly Asp Ile Val His Thr Leu Thr Asn Arg Arg Trp Leu
        515                 520                 525

Glu Lys Cys Val Thr Tyr Ala Glu Ser His Asp Gln Ala Leu Val Gly
    530                 535                 540

Asp Lys Thr Ile Ala Phe Trp Leu Met Asp Lys Asp Met Tyr Asp Phe
545                 550                 555                 560

Met Ala Leu Asp Arg Pro Ser Thr Pro Arg Ile Asp Arg Gly Ile Ala
                565                 570                 575

Leu His Lys Met Ile Arg Leu Val Thr Met Gly Leu Gly Gly Glu Gly
            580                 585                 590

Tyr Leu Asn Phe Met Gly Asn Glu Phe Gly His Pro Glu Trp Ile Asp
        595                 600                 605

Phe Pro Arg Gly Pro Gln Thr Leu Pro Thr Gly Lys Val Leu Pro Gly
    610                 615                 620

Asn Asn Asn Ser Tyr Asp Lys Cys Arg Arg Arg Phe Asp Leu Gly Asp
625                 630                 635                 640

Ala Asp Phe Leu Arg Tyr His Gly Met Gln Glu Phe Asp Gln Ala Met
                645                 650                 655

Gln His Leu Glu Glu Lys Tyr Gly Phe Met Thr Ser Glu His Gln Tyr
            660                 665                 670

Val Ser Arg Lys His Glu Glu Asp Lys Val Ile Ile Phe Glu Arg Gly
        675                 680                 685

Asp Leu Val Phe Val Phe Asn Phe His Trp Ser Asn Ser Phe Phe Asp
    690                 695                 700

Tyr Arg Val Gly Cys Ser Arg Pro Gly Lys Tyr Lys Val Ala Leu Asp
705                 710                 715                 720

Ser Asp Asp Ala Leu Phe Gly Gly Phe Ser Arg Leu Asp His Asp Val
                725                 730                 735

Asp Tyr Phe Thr Thr Glu His Pro His Asp Asn Arg Pro Arg Ser Phe
            740                 745                 750

Ser Val Tyr Thr Pro Ser Arg Thr Ala Val Val Tyr Ala Leu Thr Glu
        755                 760                 765

(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10550 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..316
(D) OTHER INFORMATION:/product= "exon 1"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1472..1828
(D) OTHER INFORMATION:/product= "exon 2"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2766..2823
(D) OTHER INFORMATION:/product= "exon 3"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2906..3028
(D) OTHER INFORMATION:/product= "exon 4"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4113..4194
(D) OTHER INFORMATION:/product= "exon 5"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4286..4459
(D) OTHER INFORMATION:/product= "exon 6"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4562..4643
(D) OTHER INFORMATION:/product= "exon 7"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4744..4855
(D) OTHER INFORMATION:/product= "exon 8"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4999..5021
(D) OTHER INFORMATION:/product= "exon 9"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 5102..5192
(D) OTHER INFORMATION:/product= "exon 10"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8593..8718
(D) OTHER INFORMATION:/product= "exon 11"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8807..8915
(D) OTHER INFORMATION:/product= "exon 12"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8992..9104
(D) OTHER INFORMATION:/product= "exon 13"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 9161..9199
(D) OTHER INFORMATION:/product= "exon 14"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 9498..9713
(D) OTHER INFORMATION:/product= "exon 15"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGGCGGCGA CGGGCGTCGG CGCCGGGTGC CTCGCCCCCA GCGTCCGCCT 50
GCGCGCCGAT CCGGCGACGG CGGCCCGGGC GTCCGCTTGC GTCGTCCGCG 100
CGCGGCTCCG GCGCTTGGCG CGGGGCCGCT ACGTCGCCGA GCTCAGCAGG 150
GAGGGCCCCG CGGCGCGCCC CGCGCAGCAG CAGCAACTGG CCCCGCCGCT 200
CGTGCCAGGC TTCCTCGCGC CGCCGCCGCC CGCGCCCGCC CAGTCGCCGG 250
CCCCGACGCA GCCGCCCCTG CCGGACGCCG GCGTGGGGGA ACTCGCGCCC 300
GACCTCCTGC TCGAAGGTAA AAAACAAGGC TGAATCCTCA GATCACTCCG 350
CGTCTTCGTT TTACCAAATA CGGTACTGCG AAGTGGTGCT GTATATGTGA 400
AGTTTCTGTC GATTTCTTCC TGACGGATGT TCAGTCGATT CAGTTGTATA 450
TATGTGATAC GTTCGTTGTT CATCGATCGT ACAGATTTAC CAGCACACTA 500
GATAGAAATC GAGACCGACG CGGGCAGATC AATAGATTTT TCTAGACGTT 550
TTATTGGATC GTGAGATGAT TGATTGGGGT GGCGTGTCGA TACGATAGCG 600
GTGCACCGCC GATGTATCGG GGCATGTGCA CGTGGTTGGG TCTCAGCAGA 650
CATATCACTA GACTGGTATC GTAATTTACT AGTACTACTG GAAAGAGGAC 700
TAAAAAGGCT AGGCCAAGTG CACGCATGTT GGGAACGTTG TTAAATTGAT 750
GAGTTTGTCC TTTGCTTGCG CTGGTATTAT TACCAAAAAA TGGTGTTAGT 800
CCCTGTACTT ATTAATGGGA AAATCTTAAC ATGACACTGG GGTTTATGAG 850
TCTCCAATTG TATATTCTCA GCACTCAACT GATTTTACTG ATACTGTAGT 900
GGAAATGACA CGTGAGCACC CCCCTTCAAG GAATGCAATG CTTCTTTCTG 950
TTTTATATTA CAGGAACTAG AAGGAGCTTC CACCTTTGAG TACAGAAGTA 1000
CTCCCTCCGT TCCAAAATAG ATGACTCAAC TTTGTACTAA TTTTGTACTA 1050
TAGTTAGTAC AAAGTTGAGT CATCTATTTT AGAACGGAGG GAGTAGTATC 1100
GAAATTGAAG ACCCTTGTAT TACTGTCTTG TTTTTCAATG AAAATGGGAG 1150
GCCCATGCAG TAAGTCACAT GGGCACCTGG GAGGCTGGGA TCATGTGTGC 1200
TTTGCAGAGT ACTAGACCCA GCTCACCCTC TGTTAGATTA CTTGTTGGGC 1250
TGCTACTTTG TGTTTGCTGT GCAGTATATC AGACATCCTG AATTTGGCAT 1300
CTAGCTGAGA ACAGAATGCA GGTTGCACCA TTCTTATTAT TGCTAAACTG 1350
TTGTCACGCA ATTTATAAAG AATGTGATCT TCTGAGTATT AATTAATCAT 1400
GTTCTGCTAA TATCTGTCCT CGCTCTGGTG TTGACAAATA TACCATATGA 1450
ATATTTTCCA TTTTGCAACC AGGGATTGCT GAGGATTCCA TCGACAGCAT 1500
AATCGTGGCT GCAAGTGAGC AGGATTCTGA GATCATGGAT GCGAATGAGC 1550
AACCTCAAGC TAAAGTTACA CGTAGCATCG TGTTTGTGAC TGGTGAAGCT 1600
GCTCCTTATG CAAAGTCAGG GGGGCTGGGA GATGTTTGTG GTTCGTTACC 1650
AATTGCTCTT GCTGCTCGTG GTCACCGTGT GATGGTTGTA ATGCCAAGAT 1700
ACTTGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATT ATACACTGCG 1750
AAGCACATTA AGATTCCATG CTTTGGGGGA TCACATGAAG TGACCTTTTT 1800
TCATGAGTAT AGAGACAACG TCGATTGGGT GGGTACACAA TCACCTTCTT 1850
ATTCTCTGTT GAATTGTAGC AACTGTTTAT CCTTGTTTAC ACTTCTTTTA 1900
GCCCTGCAAA GACATATGTG ATTTCCATAC TTTTTTGTTA TTTCCCTTGT 1950
ACTCTTGCTC ATGAAGGTCA AAATATCATA TATCCATGGA AGTCATGCAT 2000
GTGCCTAGTA TTTTTGGTGT CGGTGCCTTT AACTTTCAGG GATTAATACG 2050
TGGAATTTGA TAACTAAAGT TTATTTTATT GAAAAAAATT GTAGGTTGG 2100
TGAGCCCACA GCCACGCAGT GGCACCACTG CTTGCACATG ATTTTGCATT 2150
TCTGTTTGCA CCGAGCACTT CATGTGAATA AGGTGTAAAA TCATAAAGTA 2200
CCAATTTTAT TCTGCCAATT GCACTTAAGA GTATATACAT TTATCTTGGC 2250
CTCAATCATG GGAGTACTGT GCATTCAGTG CACCATCATT GTTCTAAGGA 2300
GAAAATGTGG GTGCAAGGAA GACACTTTTG TCCCTTAATA AAAGGCAGGC 2350
ACTCTGTTGT CATATAGATA GAAAGCAACA AACTTATTTC AAAGAGCTAA 2400
CAATGGCAAA AGAACCAAAA AAAGCATGCT AAGGCGGTGA CACCAAAAGG 2450
TGAGGGGGGC CTTGTGACTG ACAGCACCCC AAACTATTGC CATTGTTTTA 2500
CTAAATGAAG ATCATTTTAG AAGCTCTCAG GAACTTCGAA AACAGTGGCT 2550
TTCCGTCCAC AGATCGTCTG TTAATATTTT TGTCCAGTGA TACTTTTTTT 2600
GCTCCTTACA AGAGTGCCTA TGTTGACATA TACATTGTTA AGTTGTTCAT 2650
AAGTTTACTT CTTATTCTAA ACAGCAAGTG CCTAATGCTT GCATTTATTT 2700
TGGCTATTTA TTTTTATTCT CATTTCAATC AACACTTTTG TTCAGGTGTT 2750
TGTCGATCAT CCGTCATATC ATAGACCAGG AAGTTTATAT GGAGATAATT 2800
TTGGTGCTTT TGGTGATAAT CAGGTACACT ACACTATACT AAGCTCCTAG 2850
TTGACTAAGT CGTAAGTTGT ACCTCCTCGC TGACCGGCTG CTCTATGTCG 2900
TGCAGTTCAG ATACACACTC CTTTGCTATG CTGCATGCGA GGCCCCACTA 2950
ATCCTTGAAT TGGGAGGATA TATTTATGGA CAGAATTGCA TGTTTGTTGT 3000
GAACGATTGG CATGCCAGCC TTGTGCCAGT GTACGTTGTT TGTGGATCTG 3050
AAAGTCCAAT CCTTTATTCA TTCTCTGCTT TGCAGTGTGC CCATGTCTAC 3100
ATTTCTTTTA TGCTTTTTTC ATGTCTGTTC TTATATTGCA TATATGCTTA 3150
TGGAGTCTAA AAGTTACCGG AGGGAATAAC TCTTAAGGAT TTCCTCAATC 3200
AATTATCTTT AGCTTTAGTT AACATTTACT GTGGCAAACA TAATGTGTTT 3250
TGAGATTTAC AAGTTCAGAG ATTGCACTTC ACTAGTTCGT AGCTAATCTG 3300
ATGTTTTCCC CGAGAAAATG CCTAAAGCTT TGTGTCTTGA TGCATTGATA 3350
GAAAAAGAGT TTATGTACAC TCCCAAAGAG GGGACCCAAA ATTACAACAC 3400
CACACCCCTG AGAACTAGGC GCTGCCGGAA GAAGCGATGC AAGCCCCACT 3450
GCCCCTGCCT TAGCTCAAAG CCGGGCGTCA GCTTGATTGT GTCAAGTAAG 3500
CTAGCAGTGC TAGATTGCGC AAGGTCGATT CGTCGAAGAT GACAGTGTTG 3550

CGCTGCTTCC AAATCCACCA AACTATGAGC ATGATCACTG GAGAAGTACC 3600
TTTTCTCGCG GCTGAGGGGG TGGACTGGTG GTCTGCTGCT GCCAGTTTTC 3650
AGATAATCTG AAAAATGCAT GTTTTGATGA TTTTAGTATC TTGCGGACCC 3700
TGGGTACCAC CTAAGCTTTC ACACAGTAAT TTGCAGTTAC ACCTATAAAA 3750
GTAACGGTCA TGATATGCAT GTGTTTTGGG TAGATCATGG TGCATGCATT 3800
TTAGGAATTA GGACATGCCA GAACCACGTG AGGCTTATGG GGCAATTCAT 3850
TTGTTCCATT ATACGAGTCA TGAATATGGT TCAGCATGTT TGGACGCTAC 3900
TTGTTTGGGG CAATTTCAGA TGGTGAATTG TAGCTGCTTG ATGTTGGCTA 3950
GCTGGCTTAT TTTGTACAAG TATCGATGTT AGATGCATAT TTCCTTTTGT 4000
TCTTGTGCTG TTTGCCATGT TGTATTCCCC TTTTCTGTCG CCAGTGTTGC 4050
ATGTTAAATT GGTTTTCATT ACATAATCAA CTTTGTTGCT GACATCAGTC 4100
ATTTTTATTC AGCCTTCTTG CTGCAAAATA TAGACCATAC GGTGTTTACA 4150
GAGATTCCCG CAGCACCCTT GTTATACATA ATTTAGCACA TCAGGTTTGG 4200
GTCTATCACC TTTCATTATC CGTACATGGC TTTGTAAGTC GGTTCACACG 4250
TATCGTCATA CTGTATGTTA TTTCAATGTC ATTAGGGTGT GGAGCCTGCA 4300
AGTACATATC CTGATCTGGG ATTGCCACCT GAATGGTATG GAGCTTTAGA 4350
ATGGGTATTT CCAGAATGGG CAAGGAGGCA TGCCCTTGAC AAGGGTGAGG 4400
CAGTTAACTT TTTGAAAGGA GCAGTTGTGA CAGCAGATCG AATTGTGACC 4450
GTCAGTCAGG TGAAATACTC AATACTTCTC TTTTTTCTTT GCGGGATGTT 4500
CTTCAGTTCA ATTGCCCTGT CTTTCACCCA ATTAAGAAAT GATTTAATCT 4550
TTTGTTTCTA GGGTTATTCA TGGGAGGTCA CAACTGCTGA AGGTGGACAG 4600
GGCCTCAATG AGCTCTTAAG CTCCCGAAAA AGTGTATTGA ATGGTAACTA 4650
TATTTGAATC CACTTATCTT CTTCTGAAAC ATATTTACAG AAATAGATGG 4700
ATGGGTTGCA AGAATAAATT CAGTTTGCTC TTTCGGTATG AAGGAATTGT 4750
AAATGGAATT GACATTAATG ATTGGAACCC CACCACAGAC AAGTGTCTCC 4800
CTCATCATTA TTCTGTCGAT GACCTCTCTG GAAAGGTGTG TGGATAGTAC 4850
CCTATATAAT AACATGTATA TCTGATCTAG TACTTTCTTT TTCTTTGCTA 4900
GTTTGCTTCC CATGATGTTC TCACTAACTA ATCCTATGTG GTTTGGCATA 4950
CTTGTCAGGC CAAATGTAAA GCTGAATTGC AGAAGGAGCT GGGTTTACCT 5000
GTAAGGGAGG ATGTTCCTCT GGTTAGATAC AAACCCCTAA GATATATATT 5050
TTTTAAATCC CTAAAAAAAA CTTGCCGATC ATCTCATTAG CTTGATTCAC 5100
AGATTGGCTT TATTGGAAGA CTGGATTACC AGAAAGGCAT TGATCTCATT 5150
AAAATGGCCA TTCCAGAGCT CATGAGGGAG GACGTGCAGT TTGTAAGTTC 5200
ATATTCTTTT TCTTGAGACT AGAGTATAAA TCAAACATGT AGGTGTGGGG 5250
TGGTATAATA CAGACATAAG TTCCAGCTAT TGCTTCCATG AGAATTTTAA 5300
TGCTATTCAG TAATATGCTA CTGCAAGTTT TGAAACAAAG TTGGAAGCAA 5350
TAAATATATG TGTAGCACTG ACCATGCAGT GCCACTATAG CTGGAATGTC 5400
CTGTAGTCTA TGTGATCTAA CACACTCAAC AACATGTTTT CGCATACAAA 5450
CACATGCGTG CGCGCAACAA ACATACTCTA CAATAAAATT GGCTTGGTGA 5500
ACTGCAGACA TGCTCTTATC TCCATTCCAA CATTTCTTGT TTCAACATTG 5550
GCTGAAGACT AAGAGAAGGG GGACCCAGGG TGATGTAGCC AACTAGATCC 5600
AGTAAGGAAG CTAGCCGAGC CTAGGAGGAT TCGCTTAGGT AGCTGGAACG 5650
TAGGGTCTCT GACAGGGAAG CTTCGGGAGC TAGTCGATGC AGTGGTGAGG 5700
AGAGGTGTTG ATATCCTTTG CGTCCAAGAA ACCAAATGTA GGGGACAGAA 5750
GGCGAAGGAG GTGGAGGATA CCGGCTTCAA GCTGTGGTAC ATGGGACGGC 5800
TGCAAACAGA AATGGCGTAG GCATCTTGAT CAACAAGAGC CTTAAGTATG 5850
GAGTGGTAGA CGTCAAGAGA CGTGGGGACC GGATTATCCT CGTCAAGCTG 5900
GTAGTTGGGG ACTTAGTTCT CAATGTTATC AGCGTGTATG CCCCGCAAGT 5950
AGGCCACAAT GAGAACGCCA AGAGGGAGTT CTGGGAAGGC CTGGAAGACA 6000
TGGTTAGGAG TGTACCGATT GGCGAGAAGC TCTTCATAGG AGGAGACCTC 6050
AATGGCCACG TGGGTACATC TAACATAGGT TTTGAAGGGG CACATGGGGG 6100
CTTTGGCTAT GGCATCAAGA ATCAAGAAGA AGATGTCTTA CGCTTTGCTC 6150
TAGCCTACGA CATGATTGTA GCTAACACCC TCTTTAGAAA GAGAGAATCA 6200
CATCTGGTGA CTTTTAGTAG TGGCCAACAC TAGCCAGATC GATTTCATCC 6250
TCTCGAGAAG AGAAGATAGG TGTGCGCGCC TAGACTGCAA GGTGATACCT 6300
TCGGATTCGT GTCCAGCGGG ATAAGCGTGC CAAAGTCGCT AGAATGAAGT 6350

GGTGGAAGCT CAAGGGGGAG GTAGCTCAGG CGTTCAAGGA GAGGGTCATT 6400
AGGGAGGGCC CTTGGGAGGA AGGAGGGGAT GCGGACAATG TGTGGATGAA 6450
GATGGCGACT TGCATTCGTA AGGTGGCCTC GGAGGAGTGT GGAGTGTCCA 6500
GGGGATGGAG AAGCGAAGAT AAGGATACCT GGTGGTGGAA TGATGATGTC 7000
CAGAAGGCAA TTAAAGAGAA GAAAGATTGC TTTAGACGCC TATACTTGGA 7050
TAGGAGTGCA GTCAACATAG AAAAGTACAA GATGGCGAAG AAGGCCGCAA 7100
AGCGAGCTGT CAGTGAAGCA AGGGGTCGGG CATATGAGGA TCTCTACCAA 7150
CGGTTAGGCA CGAAGGAAGG CGAAAGGGAC ATCTATAAGA TGGCCAAGAT 7200
CCGAGAGAGA GGAAGACGAG GGATATTGGC CAAGTCAAAT GCATCAAGGA 7250
TGGAGCAGAC CAACTCTTGG TGAAGGACGA GGAGATTAAG CATAGATGGC 7300
GGGAGTACTT CGACAAGCTG TTCAATGGGG AGGATGAGAG TCCTACCATT 7350
GAACTTGACG ACTCCTTTGA TGAGACCATC ATGCGTTTTA TGCGGCGAAT 7400
CCAGGAGTCC GAGGTCAAGG AGGCTTTAAA AAGGAGGCAA GGCGATGGGC 7450
CCTGATTGTA TCCCCATTGA GGTGTGGAAA GGCCTCGGGG ACATAGCGAT 7500
AGTATGGCTA ACCAAGCTAT TCAACCTCAT TTTTCGGGCA AACAAGATGC 7550
CAGAAGAATG GAGACGAAGT ATATTAGTAC CAATCATCAA ACAGGGGGGA 7600
TGTTCAGAGT TGTACTAATT ACCATGGAAT TAAGCTGATG AGCCATACAA 7650
TGAAGCTATG GGAGAGAATC ATTGAGCACC GCTTAAGAAG AATGACAAGC 7700
GTGACCAAAA ATCAGTTTGG TTTCATGCCT GGGAGGTCGA CCATGGAAAC 7750
CATTTTCTTG GTACGACAAC TTATGGAGAG ATACAGGGAG CAAAAGAAGG 7800
ACTTGCATAT GGTGTTCATT GACTTGAAGA AGGCCTATAA TAAGATACCG 7850
CGGAATGTCA TGTGGTGGGC CTTGGAGAAA CACAAAGTCC CAGCAAAGTA 7900
CATTACCCTC ATCAAGGACA TGTACGATAA TGTTGTGACA AGTGTTCGAA 7950
CAAGTGATGT CGACACTAAT GACTTCCCGA TTAAGATAGG ACTGCATCAG 8000
GGGTCAGCTT TGAGCCCTTA TCTTTTTGCC TTGGTGATGG ATGAGGTCAC 8050
AAGGGATATA CAAGGAGATA TCCCATGGTG TATGCTCTTT GTGGATGATT 8100
TGGTGCTAGT TGACGATAGT CGGGCGGGGG TAAATAACAA GTTAGAGTTA 8150
TGGAGACAAA CCTTGGAATC GAAAGGGTTT AGGCTTAGTA GAACTAAAAC 8200
CGAGTACATG ATGTGCGGTT TCAGTACTAC TAGGTGTGAG GAGGAGGAGG 8250
TTAGCCTTGA TGGGCAGGTG GTACCCCAGA AGGACACCTT TCGATATTTG 8300
GGGTCAATGC TGCAGGAGGA TGGGGGTATT GATGAAGATG TGAACCATCG 8350
AATCAAAGCT GGATGGATGA AGTGGCGCCA AGCTTCTGGC ATTCTTTGTG 8400
ACAAGAGAGT GCCACAAAAG CTAAGGCAAG TTCTACAGGA CGGCGGTTCG 8450
ACCCGCAATG TTGTATGGCG CTGAGTGTTG GCCGACTAAA AGGCGACATG 8500
TTCAACAGTT AGGTGTGGCG GAGATGCGTA TGTTGAGATG GATGTGTGGC 8550
CACACGAGGA AGRATCGAGT CCGGAATGAT GATATACGAG ATAGAGTTGG 8600
GGTAGCACCA ATTGAAGAGA AGCTTGTCCA ACATCGTCTG AGATGGTTTG 8650
GGCATATTCA GCGCACGCCT CCGAAAACTC CAGTGCATAA CGGACGGCTA 8700
AAGCGTGCGG AGAATGTCAA GAGAGGGCGG GGTAGACCGA ATTTGACATG 8750
GGAGGAGTCC GTTAAGAGAG ACCTGAAGGT TTGGAGTATT ACGAAAGAAC 8800
TAGCTATGGA CARGGGTGCG TGGAAGCTTG TTATCCATGT GCCAGAGCCA 8850
TGAGTTGATC ACGAGATCTT ATGGGTTTCA CCTCTAGCCT ACCCCAACTT 8900
GTTTGGGACT AAAGGCTTTG TTGTTGTTGT TGTTGTTGTT GTTGTAGCCA 8950
ACTAAATCCA GTTGATCAGT GGTTTTTACT CTTATTTTTA CAGGTCATGC 9000
TTGGATCTGG GGATCCAATT TTTGAAGGCT GGATGAGATC TACCGAGTCG 9050
AGTTACAAGG ATAAATTCCG TGGATGGGTT GGATTTAGTG TTCCAGTTTC 9100
CCACAGAATA ACTGCAGGGT ATGCCGAGAA CTTCTTAACA AGACCTTCGT 9150
TATCAGCTTG GATATATTAT AATGTTCAAA ACATTTATGT CTCTCTTTTT 9200
GTGCAGTTGC GATATATTGT TAATGCCATC CAGGTTTGAA CCTTGTGGTC 9250
TTAATCAGCT ATATGCTATG CAATATGGTA CAGTTCCTGT AGTTCATGGA 9300
ACTGGGGGCC TCCGAGTAAG ACAACTGCCT TGAAAATTAT CGTTATCTTG 9350
GCTCCAACGC AAATGTTCTA ATTGGCTCGT GTATTCAACA GGACACAGTC 9400
GAGACCTTCA ACCCTTTTGG TGCAAAAGGA GAGGAGGGTA CAGCGTACGC 9450
ACTGCTCAAT TTTAGCTAAC TTTCAGTTTA TCTTTTTGCA ATGTCTTGGG 9500
GGTTCATTGC GCCATAAATC AACTTGTGAT AATTAACTGT TACTGTTCTG 9550
TACTTGCAGG TGGGCGTTCT CACCGCTAAC CGTGGACAAG ATGTTGTGGG 9600
TAAGTTTTTG CTGAGCTCTT GTCCGGTTAT AGGATCGACC TTGGCTGTAG 9650
CATGGTACCT TAGTGCCCCT TGTATATAGA CCTAACCTGA TGGACTCACT 9700
TTGTCTACAC TAATCATAGT AGTCGATTGC CCGGAGGCGT TTTGCTTGGA 9750
TTCTGCTAAT TTAATTTTCA TGACGATAAC TCATACCATG GTTTGGTTCT 9800
CCGATGGGGG CCAGAATGGC GTCTAGTGTC TGCGATCTGT GTAACTAGCC 9850
AATGCCGGGT TGTTCCAAGT GAAAATTTAC CTTTTGACCA TTGTGCAGGC 9900
ATTGCGAACC GCGATGTCGA CATTCAGGGA GCACAAGCCG TCCTGGGAGG 9950
GGCTCATGAA GCGAGGCATG ACGAAAGACC ATACGTGGGA CCATGCCGCC 10000
GAGCAGTACG AGCAGATCTT CGAATGGGCC TTCGTGGACC AACCCTACGT 10050
CATGTAGACG GGGACTGGGG AGGTCGAAGC GCGGGTCTCC TTGAGCTCTG 10100
AAGACATGTT CCTCATCCTT CCGCGGCCCG GAAGGATACC CCTGTACATT 10150
GCGTTGTCCT GCTACAGTAG AGTCGCAATG CGCCTGCTTG CTTGGTCCGC 10200
CGGTTCGAGA GTAGATGACG GCTGTGCTGC TGCGGCGGTG ACAGCTTCGG 10250
GTGGATGACA GTTACAGTTT TGGGGAATAA GGAAGGGATG TGCTGCAGGA 10300
TGGTTAACAG CAAAGCACCA CTCAGATGGC AGCCTCTCTG TCCGTGTTAC 10350
AGCTGAAATC AGAAACCAAC TGGTGACTCT TTAGCCTTAG CGATTGTGAA 10400
GTTTGTTGCA TTCTGTGTAT GTTGTCTTGT CCTTAGCTGA CAAATATTTG 10450
ACCTGTTGGA TAATTCTATC TTTGCTGCTG TTTTTCTTTT GGTCAAAAGA 10500
GGGGTTCCCT CCGATTTCAT TAACGAAACC ACCAAAATAA CAGCACCCAG 10550
TGCAGGTCTC AGGTTCAGAT ATACTTAAGA CTACTAAATC TAACAGCAGC 10600
TAAAAAGCTT AAAGATTCAG GCGACATAAC CGAACAAAAT CCACAACCGA 10650
AGGGACCAAA GCAGGACAAG TAAAAAGGCA GNCGACACAA AGCGCAGGTC 10700
GCTGAAAAGG CAAGCAGACA GAGGTCTGCA TTCTGTCAAC ACCACTTGTG 10750
AAAAATGAAG AGAAGATCGA GAATTCCCGG GAATCCG 10787



(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 647 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..647
(D) OTHER INFORMATION: /product= "deduced amino acid sequence for SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Ala Ala Thr Gly Val Gly Ala Gly Cys Leu Ala Pro Ser Val Arg
1 5 10 15

Leu Arg Ala Asp Pro Ala Thr Ala Ala Arg Ala Ser Ala Cys Val Val
20 25 30

Arg Ala Arg Leu Arg Arg Leu Ala Arg Gly Arg Tyr Val Ala Glu Leu
35 40 45

Ser Arg Glu Gly Pro Ala Ala Arg Pro Ala Gln Gln Gln Gln Leu Ala
50 55 60

Pro Pro Leu Val Pro Gly Phe Leu Ala Pro Pro Pro Pro Ala Pro Ala
65 70 75 80

Gln Ser Pro Ala Pro Thr Gln Pro Pro Leu Pro Asp Ala Gly Val Gly
85 90 95

Glu Leu Ala Pro Asp Leu Leu Leu Glu Gly Ile Ala Glu Asp Ser Ile
100 105 110

Asp Ser Ile Ile Val Ala Ala Ser Glu Gln Asp Ser Glu Ile Met Asp
115 120 125

Ala Asn Glu Gln Pro Gln Ala Lys Val Thr Arg Ser Ile Val Phe Val
130 135 140

Thr Gly Glu Ala Ala Pro Tyr Ala Lys Ser Gly Gly Leu Gly Asp Val
145 150 155 160

Cys Gly Ser Leu Pro Ile Ala Leu Ala Ala Arg Gly His Arg Val Met
165 170 175

Val Val Met Pro Arg Tyr Leu Asn Gly Ser Ser Asp Lys Asn Tyr Ala
180 185 190

Lys Ala Leu Tyr Thr Gly Lys His Ile Lys Ile Pro Cys Phe Gly Gly
195 200 205

Ser His Glu Val Thr Phe Phe His Glu Tyr Arg Asp Asn Val Asp Trp
210 215 220

Val Phe Val Asp His Pro Ser Tyr His Arg Pro Gly Ser Leu Tyr Gly
225 230 235 240

Asp Asn Phe Gly Ala Phe Gly Asp Asn Gln Phe Arg Tyr Thr Leu Leu
245 250 255

Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile Leu Glu Leu Gly Gly Tyr
260 265 270

Ile Tyr Gly Gln Asn Cys Met Phe Val Val Asn Asp Trp His Ala Ser
275 280 285

Leu Val Pro Val Leu Leu Ala Ala Lys Tyr Arg Pro Tyr Gly Val Tyr
290 295 300

Arg Asp Ser Arg Ser Thr Leu Val Ile His Asn Leu Ala His Gln Gly
305 310 315 320

Leu Glu Pro Ala Ser Thr Tyr Pro Asp Leu Gly Leu Pro Pro Glu Trp
325 330 335

Tyr Gly Ala Leu Glu Trp Val Phe Pro Glu Trp Ala Arg Arg His Ala
340 345 350

Leu Asp Lys Gly Glu Ala Val Asn Phe Leu Lys Gly Ala Val Val Thr
355 360 365

Ala Asp Arg Ile Val Thr Val Ser Gln Gly Tyr Ser Trp Glu Val Thr
370 375 380

Thr Ala Glu Gly Gly Gln Gly Leu Asn Glu Leu Leu Ser Ser Arg Lys
385 390 395 400

Ser Val Leu Asn Gly Ile Val Asn Gly Ile Asp Ile Asn Asp Trp Asn
405 410 415

Pro Thr Thr Asp Lys Cys Leu Pro His His Tyr Ser Val Asp Asp Leu
420 425 430

Ser Gly Lys Ala Lys Cys Lys Ala Glu Leu Gln Lys Glu Leu Gly Leu
435 440 445

Pro Val Arg Glu Asp Val Pro Leu Ile Gly Phe Ile Gly Arg Leu Asp
450 455 460

Tyr Gln Lys Gly Ile Asp Leu Ile Lys Met Ala Ile Pro Glu Leu Met
465 470 475 480

Arg Glu Asp Val Gln Phe Val Met Leu Gly Ser Gly Asp Pro Ile Phe
485 490 495

Glu Gly Trp Met Arg Ser Thr Glu Ser Ser Tyr Lys Asp Lys Phe Arg
500 505 510

Gly Trp Val Gly Phe Ser Val Pro Val Ser His Arg Ile Thr Ala Gly
515 520 525

Cys Asp Ile Leu Leu Met Pro Ser Arg Phe Glu Pro Cys Gly Leu Asn
530 535 540

Gln Leu Tyr Ala Met Gln Tyr Gly Thr Val Pro Val Val His Gly Thr
545 550 555 560

Gly Gly Leu Arg Asp Thr Val Glu Thr Phe Asn Pro Phe Gly Ala Lys
565 570 575

Gly Glu Glu Gly Thr Gly Trp Ala Phe Ser Pro Leu Thr Val Asp Lys
580 585 590

Met Leu Trp Ala Leu Arg Thr Ala Met Ser Thr Phe Arg Glu His Lys
595 600 605

Pro Ser Trp Glu Gly Leu Met Lys Arg Gly Met Thr Lys Asp His Thr
610 615 620

Trp Asp His Ala Ala Glu Gln Tyr Glu Gln Ile Phe Glu Trp Ala Phe
625 630 635 640

Val Asp Gln Pro Tyr Val Met
645

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5072 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Triticum tauschii

(F) TISSUE TYPE: Endosperm

(ix) FEATURE:

(A) NAME/KEY: promoter

(B) LOCATION: 1..4993

(D) OTHER INFORMATION: /function= "region containing promoter of SSS I"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



TCTAGATGCA TGCTGGATAG CGGTCGATGT GTGGAGTAAT AGTAGTAGAT GCAGAATCGT 60

TTCGGTCTAC TTGTCGCGGA CGTGATGCCT ATATACATGA TCATACCTAG ATATTCTCAT 120

AACTATGCTC AATTCTATCA ATTGCTCGAC AGTAATTCGT TTACCCACCG TAATACTTAT 180

GATCTTGAGA GAAGTCACTA GTGAAACCTA TGCCCCCCAG GTCTATTTTG CATCATATTA 240

ATCTTCCAAT ACTTAGTTAT TTCCATTGCC GTTTATTTTA CTTTGTATCT TTATTTCTTT 300

TTATTATAAA AAATACCAAA AATATTATCT TATCATATCT ATCAGATCTC ATTCTCGTAA 360

GTGACCGTGA AGGGATTGAC AACCCCTTTA TCGTGTTGGT TGCGAGGTTC TTGTTTGTTT 420

GTGTAGGTGC GTGTGACTCG CACGTCTCCT ACTGGATTGA TACCTTGGGT TTTCAAAAAC 480

TGAGAAAAAT ACTTACGCTA CTTTACTGCA TAACCCTTTC CTCTTTAAAA AAAAAAACCA 540

ACGTAGTATT CAAGAGGTAG CACGCTACCA TCCTCTCCAA CAGGhGCGCG GAGATCTTTG 600

TCCGGCAGGT TGATGCGGGC CGGGGAAGAA CTCCAGCTGC CTTGGCCAGC TTGGTCGTGA 660

GCCGCCCCAG CGGCGTCTTG AACCTGTCCA CGTAGCGCTC CCTGACACGC GGCGTGAACT 720

GAGAAGGCTT GTCGATGAAC TCCAGCTGTT GTGCCAGCCT AGCTTGCGCC TTCTTCTGCT 780

GGGTCATGCC CTTCGAGAAA CCCACCTTGG CCACCCTTGT GCTTGAGCGG CGCGCCACCT 840

CAGCAGGCGG CGGCGTGGGG ATGAAGAGGG TGTCTGCTTC CGGAGCAGGC GGGTCGGCGT 900

TGAACTTGAA AGGCGGTGGC CCCATGATGG ATGGGGGGAG CATGCCAAAG ACTTGGTTGA 960

GGAAAGTGGT GTTGGCGTCC ACCTCCAGTG CCTGCAGTTT GGAAGCCAGA CGATTGGCGT 1020

CGATCTCTGG CTCCGGCTGG AAGGAGGCTC GACGCTCCGG TGTGCCAGAA CGCAAAGGGA 1080

GGAGCGGCAG CTCTGGCTGA GCAGACCCCG CGCCCATGTA GTCTGCATTG GGCCAAGGCT 1140

GCAGGGGCAA GCCACCGGGA TGGGGGCGCG AGGTGGACTG CGCACCGGAG GAAGGCCAAG 1200

CTCAACCTCG GTGAGGTTCG CCCCAGACCA GGGCGGCAGG CTCGGGTCCA CAAAGGGGCA 1260

AACCGGCTCG TCCGCGGCGA AACTGTCCAG GACAGACGGC GGACGACGGA AGGCCGTGTC 1320

GTCGAGCTCG AGCAGCAGAG GGTCCGTGCG GGTGATGTCT TGCCAAATGG ACTCCAGGTC 1380

CAGCAGGAAG GGGGAGTGGT CCATCGCCCC TGGCCAAGCC ACTGGTACGC CAAAGATGGC 1440

ATCAGCAGCG TTTGCACCAG GGGGAGCAGC CACACCTTGG AGGACAGGGA GGGTGCGGAC 1500

GTCGACGGCA GCAAAACGTG GCTGGAGCAA GTTGCCGTCG CGTGCCGGCC TCGGCGAGCG 1560

CGAGCGGCTG TAGGAGCGCT CGGTGCCCTC AGACTCGGAC AGTGCGCCAG TGGGAGAGCC 1620

ATGGCGACGG CGGCCACCAC TGGACGTGCC ATGGCGCTGG TCGTGACGGC GCGTGGATGG 1680

CCCGTCCTCG CGGGCAGCTC CACCTGAGGG GCACCGGAGG AGCACACCCC GCCAAGCTGG 1740

GCCAGGGCGG CTGCGGCGAC GGCGACGGGC GCGGTCGCGG TGTGCACCAT CATCTTCATC 1800

TTGGTCATCG TGGCGCCTCG GACAAGGATG CTCGCTGTCA CCGACGCGAG GGACGTGAGC 1860

CGGCTCAGCC CGCCCTTCCT CGACGTGGCG AGCCCTGCGG ATATGCTCCT CGAGCGGCCA 1920

TTGGGGGTCG TTGGCGCGCG GCATCTCGGG GTCGCGGTCA GGTATCGGGG TGTAGTCCTT 1980

TGTGGTGTGC AGGTGGATGA GCAGAGAGAA ATCCGGCCCC TCTAGCCCCT CGTCCCGGGG 2040

GGAGCCCTCC GGGAGGGTCT GGCGGCCCGT GGGGTCCAGG GGTCGATCGA TGATGGAGAA 2100

CCCCCTTTTG GTGGGGATGT CGTCCGGACT CCATGCGCAC ACCCAGGCAA AGAGGCAGGC 2160

CGTGTTGGAG AGGGAGGTCG TCTGCCGCTG CAACCAGTCG ACGTGGGATG TCTTCCCGAG 2220

GGGATCCTGC CCCGCCTCCT TGTTCCAGGA CTGCAGCGGC ATGTTCTCGA CGGCGATGCG 2280

GCAGTAGTAC CGCCAGACAC GGCGGTGGCC GTGTGCGGAT GGTGACGAGG CCGACAGGGA 2340

GAGCGCGACG GCCCAGCAGG AGACGACCCC AGCGTCGAAA GCGATGTCCC GGTGCCTGAA 2400

GTGGACGAGC CCAGAGATGG CCAGGCGCAT TGACGCGGGG AAGGGGAAGG AGTTAGGATG 2460

GGCGAGGCGG CCGGAGTGAA CCGCGGCGTG GTGGGCGACG GGGCTGGAGA GGCAGAGGCG 2520

GAGTCATCCG AGAGAGGTGT ATCAGTGGCT CTGCACAATA CCCAGTGTCG CCACATCATA 2580

TCCTGGTGAA TAAGCACACA TGTGTACTGT CGTTAAATAA ATCATTGGTC ACGCGAAGCG 2640

GGAAAAAGAC GGCGAAAAAT TCACGGAGAC ACGACTAGTA GTACCCAATA TACTCGGCAA 2700

AAACAGTGAG AGGTCGTTTT GCGTTGTCGG CCGGTGTTGT CGAGTCATTG TACTATGTTT 2760

TGTCGTTTCT TTCTTTTCTC CAAATCGACA AACCGTTTGT CTTTGGTTAA AAAACAGAAA 2820

CATACAAAAT CAAATGAATG CATTCAAGGG GCGGTAATCC AATTGTGAGC CCAGGCTCAG 2880

GTACACGCGC GGTTACAAAA AAATCAAAAT AAATACTAGA AAAATTCAAA AAATTCCAAT 2940

TTGTTTGTGC GTGGTAGATA ATTTGATGCG TGAGGTACGC TTCAATTTTC AAATTATTTG 3000

GACATGTGAG CAGCTCTCAG CAAAAAAGAC AAATTCGGGG TCTGTAAAAA TGTTTACTGT 3060

TCATGCACTG TTCTGACCCG ATTTGTCTTT TTTGCTGAGA GCTTCTCAGA AGTCCAAATG 3120

AGCTAAAATT TTGAGCGGAG GTTACGTGAT AAAATGTCTA TCATGCAAAA AAGGATTGGA 3180

ATTTTTTGAA TTTTTTTTAT TTTTTGTGAT TTGTTTCCTG GACGGGTGCA GATAAGCCTG 3240

GGCACCGAAA CGCCGCACTC AGGCTCATCC TTTTCTATAA AAGAAAAGAA ATACATACAA 3300

TTTCCCTCTG TTTTTTGAGC AAGGGGCACC ACCCACCAAA GAGTTTTCAA CTCACATGGT 3360

ATTAGAGCAT CTACAGCCGG GCGTCTCAAA CCAGCCTCAT ACGCTTGAGC GGGTCGCCTT 3420

GGTCACGATT TTTTGACCCA GACGGGCCCC TCAAACGGTC CTTAAACGCC CAGGCTGACC 3480

GACAACCCAC ATATCCAGCC CAAATATGGG GTGGATATGG GGGCGCCCGG GCACGCCAGC 3540

CCGCGGACAC CACACATCTT CAGTTTCTAA TTTGAGATAT CCGGATGTGG AATGCGTTTT 3600

TGAGGGGTGA CCGGTCCCTG TCCGTGGATG CGCCCGGACG TTTGAGGGGT TGGATTTGCC 3660

AAGTCTGATT AGAGATGCTC TTAGGTGTTC CACCCCCATC CCTTGATGGC TAGGGCAAAC 3720

TCTCCCCTCC AAACTTTGTC GGCGAGCCTG TGGATTCTTC TCTCCTCTGC CCGCTGCTCC 3780

GGCGGCTGAT GGCGGGGAGG AGAATCCCGG TGTCTTCGCT TGGTTAGTTG TTTAAGTTAC 3840

GTACTTTTTT AGTCCTCGCA GGTGCGGCGT TCGGACGTAT GGTCGTGCTT CTTTTTTGAG 3900

TTTGTCTTCC GGGCTCTGAT CCTCCTCGAG TTCGTCCATC TGGACGTACT CGACGGAGCT 3960

CCGGCATAGA TTCCTATCAT CGTCTTGGTG AGGTGAGGTT ATGGTTTCTT GTCATGTGGG 4020

CAGATTTGGT GCCAGATGCT TCATATCTAT TCAAGGGTTC AGCGGCAACA ACTGCGGCTC 4080

CAGAGCGATG GTCCTTAAGG GCACGTGCAC GAAGACTTCA CGGCTGTTAT CGACAAGGTC 4140

AAGCCGGCTC CGATAGGGGA GCAGCGACAG CGGCGCGTCA ACCGCTCGTT CTGGCGGCAG 4200

TAGTGGTCGT TCGGTGCTCT CGGAACCTCG ATGTAATTTT TATGATTTTA GAGATGCTTT 4260

GTACTTCCGA TCGATGAACT CTGATAATAG ATATCTCTTC TCTCGCAAAA AAAGAGAGTT 4320

TTCAACTGAA AACAAAAGAG TTTCACTAGT TCTTCTTTTA GAAACAGAGT TTCACTAGCA 4380

CTTTTTTTTG CGAGAAGTCG AGTTTCACTA AGTACTAAAC CCACGCAATT ATTCTCAAAA 4440

AAAAAACCCA CGCAACTGTC TGGATCCATC TTCGTTTTTT CCCCGAGAAT CGTCTGGATC 4500

CATTTTCGTG TGCGAGGCAT CCTCTCATTT TGCACGGCCC AGCTCTCTTC TCGCCGGCGT 4560

ACGCTGCTAC ATGTCGGCAC TCCACGCAAA CAAAAAGAAG CCCAACCGAA AACGCACGCG 4620

CCTTTCCAGG CTCACCACGG AAAAAAATAC CACGCGCCGC TCACGAGCAA ACCGTGACAA 4680

CAGCCAGCCA GATATGGCAA CGGAGGCACG GGCCGCACAC AGCCACTGAA AACCGCAGCT 4740

GCTCTTCCGT CCGTCCGTCC CTCCGCCCGT CCGCGCCACT CCACTCGCCT TGCCCCACTC 4800

CCACTCTTCT CTCCCCGCGC ACACCGAGTC GGCACCGGCT CATCACCCAT CACCTCGGCC 4860

TCGGCCACCG GCAAACCCCC CGATCCGCTT TTGCAGGCAG CGCACTAAAA CCCCGGGGAG 4920

CGCGCCCCGC GGCAGCAGCA GCACCGCAGT GGGAGAGAGA GGCTTCGCCC CGGCCCGCAC 4980

CGAGCGGGGC GATCCACCGT CCGTGCGTCC GCACCTCCTC CGCCTCCTCC CCTGTCCCGC 5040

GCGCCCACAC CCATGGCGGC GACGGGCGTC GG 5072

(2) INFORMATION FOR SEQ ID NO: 16: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1706 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1706
(D) OTHER INFORMATION: /product= "partial cDNA for hexaploid wheat DBE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCT GTG TCG AAG CTT GAC TAT TTG AAG GAG CTT GGA GTT AAT TGT ATT 48
Ala Val Ser Lys Leu Asp Tyr Leu Lys Glu Leu Gly Val Asn Cys Ile
1 5 10 15

GAA TTA ATG CCC TGC CAT GAG TTC AAC GAG CTG GAG TAC TCA ACC TCT 96
Glu Leu Met Pro Cys His Glu Phe Asn Glu Leu Glu Tyr Ser Thr Ser
20 25 30

TCT TCC AAG ATG AAC TTT TGG GGA TAT TCT ACC ATA AAC TTC TTT TCA 144
Ser Ser Lys Met Asn Phe Trp Gly Tyr Ser Thr Ile Asn Phe Phe Ser
35 40 45

CCA ATG ACG AGA TAC ACA TCA GGC GGG ATA AAA AAC TGT GGG CGT GAT 192
Pro Met Thr Arg Tyr Thr Ser Gly Gly Ile Lys Asn Cys Gly Arg Asp
50 55 60

GCC ATA AAT GAG TTC AAA ACT TTT GTA AGA GAG GCT CAC AAA CGG GGA 240
Ala Ile Asn Glu Phe Lys Thr Phe Val Arg Glu Ala His Lys Arg Gly
65 70 75 80

ATT GAG GTG ATC CTG GAT GTT GTC TTC AAC CAT ACA GCT GAG GGT AAT 288
Ile Glu Val Ile Leu Asp Val Val Phe Asn His Thr Ala Glu Gly Asn
85 90 95

GAG AAT GGT CCA ATA TTA TCA TTT AGG GGG GTC GAT AAT ACT ACA TAC 336
Glu Asn Gly Pro Ile Leu Ser Phe Arg Gly Val Asp Asn Thr Thr Tyr
100 105 110

TAT ATG CTT GCA CCC AAG GGA GAG TTT TAT AAC TAT TCT GGC TGT GGG 384
Tyr Met Leu Ala Pro Lys Gly Glu Phe Tyr Asn Tyr Ser Gly Cys Gly
115 120 125

AAT ACC TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC ATT GTA GAT 432
Asn Thr Phe Asn Cys Asn His Pro Val Val Arg Gln Phe Ile Val Asp
130 135 140

TGT TTA AGA TAC TGG GTG ATG GAA ATG CAT GTT GAT GGT TTT CGT TTT 480
Cys Leu Arg Tyr Trp Val Met Glu Met His Val Asp Gly Phe Arg Phe
145 150 155 160

GAT CTT GCA TCC ATA ATG ACC AGA GGT TCC AGT CTG TGG GAT CCA GTT 528
Asp Leu Ala Ser Ile Met Thr Arg Gly Ser Ser Leu Trp Asp Pro Val
165 170 175

AAC GTG TAT GGA GCT CCA ATA GAA GGT GAC ATG ATC ACA ACA GGG ACA 576
Asn Val Tyr Gly Ala Pro Ile Glu Gly Asp Met Ile Thr Thr Gly Thr
180 185 190

CCT CTT GTT ACT CCA CCA CTT ATT GAC ATG ATC AGC AAT GAC CCA ATT 624
Pro Leu Val Thr Pro Pro Leu Ile Asp Met Ile Ser Asn Asp Pro Ile
195 200 205

CTT GGA GGC GTC AAG CTC ATT GCT GAA GCA TGG GAT GCA GGA GGC CTC 672
Leu Gly Gly Val Lys Leu Ile Ala Glu Ala Trp Asp Ala Gly Gly Leu
210 215 220

TAT CAA GTA GGT CAA TTC CCT CAC TGG AAT GTT TGG TCT GAG TGG AAT 720
Tyr Gln Val Gly Gln Phe Pro His Trp Asn Val Trp Ser Glu Trp Asn
225 230 235 240

GGG AAG TAC CGG GAC ATT GTG CGC CAA TTC ATT AAA GGC ACT GAT GGA 768
Gly Lys Tyr Arg Asp Ile Val Arg Gln Phe Ile Lys Gly Thr Asp Gly
245 250 255

TTT GCT GGT GGT TTT GCC GAA TGT CTT TGT GGA AGT CCA CAC CTA TAC 816
Phe Ala Gly Gly Phe Ala Glu Cys Leu Cys Gly Ser Pro His Leu Tyr
260 265 270

CAG GCA GGA GGA AGG AAA CCT TGG CAC AGT ATC AAC TTT GTA TGT GCA 864
Gln Ala Gly Gly Arg Lys Pro Trp His Ser Ile Asn Phe Val Cys Ala
275 280 285

CAT GAT GGA TTT ACA CTG GGT GAT TTG GTA ACA TAT AAT AAC AAG TAC 912
His Asp Gly Phe Thr Leu Gly Asp Leu Val Thr Tyr Asn Asn Lys Tyr
290 295 300

AAT TTA CCA AAT GGG GAG AAC AAT AGA GAT GGA GAA AAT CAC AAT CTT 960
Asn Leu Pro Asn Gly Glu Asn Asn Arg Asp Gly Glu Asn His Asn Leu
305 310 315 320

AGC TGG AAT TGT GGG GAG GAA GGA GAA TTC GCA AGA TTG TCT GTC AAA 1008
Ser Trp Asn Cys Gly Glu Glu Gly Glu Phe Ala Arg Leu Ser Val Lys
325 330 335

AGA TTG AGG AAG AGG CAG ATG CGC AAT TTC TTT GTT TGT CTC ATG GTT 1056
Arg Leu Arg Lys Arg Gln Met Arg Asn Phe Phe Val Cys Leu Met Val
340 345 350

TCT CAA GGA GTT CCA ATG TTT TAC ATG GGC GAT GAA TAT GGC CAC ACA 1104
Ser Gln Gly Val Pro Met Phe Tyr Met Gly Asp Glu Tyr Gly His Thr
355 360 365

AAA GGG GGC AAC AAC AAT ACA TAC TGC CAT GAT TCT TAT GTC AAT TAT 1152
Lys Gly Gly Asn Asn Asn Thr Tyr Cys His Asp Ser Tyr Val Asn Tyr
370 375 380

TTT CGC TGG GAT AAA AAA GAA CAA TAC TCT GAC TTG CAC AGA TTC TGC 1200
Phe Arg Trp Asp Lys Lys Glu Gln Tyr Ser Asp Leu His Arg Phe Cys
385 390 395 400

TGC CTC ATG ACC AAA TTC CGC AAG GAG TGC GAG GGT CTT GGC CTT GAG 1248
Cys Leu Met Thr Lys Phe Arg Lys Glu Cys Glu Gly Leu Gly Leu Glu
405 410 415

GAC TTT CCA ACG GCC GAA CGG CTG CAG TGG CAT GGT CAT CAG CCT GGG 1296
Asp Phe Pro Thr Ala Glu Arg Leu Gln Trp His Gly His Gln Pro Gly
420 425 430

AAG CCT GAT TGG TCT GAG AAT AGC CGA TTC GTT GCC TTT TCC ATG AAA 1344
Lys Pro Asp Trp Ser Glu Asn Ser Arg Phe Val Ala Phe Ser Met Lys
435 440 445

GAT GAA AGA CAG GGC GAG ATC TAT GTG GCC TTC AAC ACC AGC CAC TTA 1392
Asp Glu Arg Gln Gly Glu Ile Tyr Val Ala Phe Asn Thr Ser His Leu
450 455 460

CCG GCC GTT GTT GAG CTC CCA GAG CGC GCA GGG CGC CGG TGG GAA CCG 1440
Pro Ala Val Val Glu Leu Pro Glu Arg Ala Gly Arg Arg Trp Glu Pro
465 470 475 480

GTG GTG GAC ACA GGC AAG CCA GCA CCA TAT GAC TTC CTC ACC GAC GAC 1488
Val Val Asp Thr Gly Lys Pro Ala Pro Tyr Asp Phe Leu Thr Asp Asp
485 490 495

TTA CCT GAT CGC GCT CTC ACC ATA CAC CAG TTC TCT CAT TTC CTC AAC 1536
Leu Pro Asp Arg Ala Leu Thr Ile His Gln Phe Ser His Phe Leu Asn
500 505 510

TCC AAC CTC TAC CCC ATG CTC AGC TAC TCA TCG GTC ATC CTA GTA TTG 1584
Ser Asn Leu Tyr Pro Met Leu Ser Tyr Ser Ser Val Ile Leu Val Leu
515 520 525

CGC CCT GAT GTT TGA GAG ACA AAT ATA TAC AGT AAA TAA TAT GTC TAT 1632
Arg Pro Asp Val * Glu Thr Asn Ile Tyr Ser Lys * Tyr Val Tyr
530 535 540

ATG TAG TCC TTT GGC GTA TTA TCA GTG TGC ACA ATT GCT CTA TTG CCA 1680
Met * Ser Phe Gly Val Leu Ser Val Cys Thr Ile Ala Leu Leu Pro
545 550 555 560

GTG ATC TAT TCG ATA GCG GCC GCG AA 1706
Val Ile Tyr Ser Ile Ala Ala Ala
565

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9289 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..9289
(D) OTHER INFORMATION: /product= "genomic sequence of DBE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CGG GAC CGT CCC TTG GCA ACT TGG GTT ACG TTG GGA CCT GAC GCT TCG 48
Arg Asp Arg Pro Leu Ala Thr Trp Val Thr Leu Gly Pro Asp Ala Ser
570 575 580

CTT ATC CGG TGT GCC CTG AGA CGA GAT ATG TGC AGC TCC TAT CGG ATT 96
Leu Ile Arg Cys Ala Leu Arg Arg Asp Met Cys Ser Ser Tyr Arg Ile
585 590 595 600

TGT CGG CAC ATT CGG CGG CTT TGC TGG TCT TGT TTT ACC ATT GTC GAA 144
Cys Arg His Ile Arg Arg Leu Cys Trp Ser Cys Phe Thr Ile Val Glu
605 610 615

ATG TCT TAT AAA CCG GGA TTC CGA GAC TGA TCG GGT CTT CCC GGG AGA 192
Met Ser Tyr Lys Pro Gly Phe Arg Asp * Ser Gly Leu Pro Gly Arg
620 625 630

AGG TTT ATC CTT CGT TGA CCG TGA GAG CTT ATA ATG GGC TAA GTT GGG 240
Arg Phe Ile Leu Arg * Pro * Glu Leu Ile Met Gly * Val Gly
635 640 645

ACA CCC CTG CAG GGT ATT ATC TTT CGA AAG CCG TGC CCG CGG TTA TGA 288
Thr Pro Leu Gln Gly Ile Ile Phe Arg Lys Pro Cys Pro Arg Leu *
650 655 660

GGC AGA TGG GAA TTT GTT AAT GTC CGA TTG TAG AGA ACC TGT CAC TTG 336
Gly Arg Trp Glu Phe Val Asn Val Arg Leu * Arg Thr Cys His Leu
665 670 675 680

ACT TAA TTT AAA ATT CAT CAA CCG TGT GTG TAG CCG TGA TGG TCT CTT 384
Thr * Phe Lys Ile His Gln Pro Cys Val * Pro * Trp Ser Leu
685 690 695

TTC GGC GGA GTC CGG GAA GTG AAC ACG GTT TGA GTT ATG CAT GAA CGT 432
Phe Gly Gly Val Arg Glu Val Asn Thr Val * Val Met His Glu Arg
700 705 710

AAG TAG TTT CAG GAT CAC TCC TTG ATC ACT TCT AGC TCC GCG ACC GTT 480
Lys * Phe Gln Asp His Ser Leu Ile Thr Ser Ser Ser Ala Thr Val
715 720 725

GCG TTG TTT CTC TTC TCG CTC TCA TTT GCG TAT GTT AGC CAC CAT ATA 528
Ala Leu Phe Leu Phe Ser Leu Ser Phe Ala Tyr Val Ser His His Ile
730 735 740

TGC TTA GTG TCT GCT GCA GCT CCA CCT CAT TAC CCC TTC CTT TCC TAT 576
Cys Leu Val Ser Ala Ala Ala Pro Pro His Tyr Pro Phe Leu Ser Tyr
745 750 755 760

AAG CTT AAA TAG TCT TGA TCT CGC GGG TGT GAG ATT GCT GAG TCC TCG 624
Lys Leu Lys * Ser * Ser Arg Gly Cys Glu Ile Ala Glu Ser Ser
765 770 775

TGA CTT ACA GAT TCT ACC AAA ACA GTT GCA GGT GTC GAC GAT GCC AGT 672
* Leu Thr Asp Ser Thr Lys Thr Val Ala Gly Val Asp Asp Ala Ser
780 785 790

GCA GGT GAC GCA ACC GAG CTC AAG TGG GAG TTC GAC GAG GAA CGT GGT 720
Ala Gly Asp Ala Thr Glu Leu Lys Trp Glu Phe Asp Glu Glu Arg Gly
795 800 805

CGT TAC TAT GTT TCT TTT CCT GAT GAT CAG TAG TGG AGC CCA GTT GGG 768
Arg Tyr Tyr Val Ser Phe Pro Asp Asp Gln * Trp Ser Pro Val Gly
810 815 820

ACG ATC GGG GAT CTA GCA TTT GGG GTT ATC TTA ATT TCT TTT AGA TTT 816
Thr Ile Gly Asp Leu Ala Phe Gly Val Ile Leu Ile Ser Phe Arg Phe
825 830 835 840

GAC CGT AAT CGG TCT ATG TGT GGA TTT TGG ATG ATG TAT GAA TTA TTT 864
Asp Arg Asn Arg Ser Met Cys Gly Phe Trp Met Met Tyr Glu Leu Phe
845 850 855

ATG TAT TGT GTG AAG TGG CGA TTG TAA GCC AAC TCT CGT TAT CCC ATT 912
Met Tyr Cys Val Lys Trp Arg Leu * Ala Asn Ser Arg Tyr Pro Ile
860 865 870

CTT GTT CAT TAC ATG GGA TTG TGT GAA GAT GAC CCT TCT TGC GAC AAA 960
Leu Val His Tyr Met Gly Leu Cys Glu Asp Asp Pro Ser Cys Asp Lys
875 880 885

ACC ACA ATG CGG TTA TGC CTC TAA GTC GTG CCT CGA CAC GTG GGA GAT 1008
Thr Thr Met Arg Leu Cys Leu * Val Val Pro Arg His Val Gly Asp
890 895 900

ATA GCC GCA TCG TGG GCG TTA CAC GCA AGT CTT CAT AGC AAC CAA AAC 1056
Ile Ala Ala Ser Trp Ala Leu His Ala Ser Leu His Ser Asn Gln Asn
905 910 915 920

TCC TCT CCG CAT TAC AAG CCA CCA ATC GCA GCC ACC ATG ACT TTC TTC 1104
Ser Ser Pro His Tyr Lys Pro Pro Ile Ala Ala Thr Met Thr Phe Phe
925 930 935

ACC ACT GTC AAT GCC ATG AAA ATC TAT ATG TAG ACA TGT CCC ATT GCA 1152
Thr Thr Val Asn Ala Met Lys Ile Tyr Met * Thr Cys Pro Ile Ala
940 945 950

TCG GCA AGA AAG CGA AGC TTC ACG GCA CAC CTT CAT GAA GCC TCT CTG 1200
Ser Ala Arg Lys Arg Ser Phe Thr Ala His Leu His Glu Ala Ser Leu
955 960 965

GCC GAA GAC AAG GAT GCG CCC GAC CGG ATC AAT TCC TAT CTA GAT ACC 1248
Ala Glu Asp Lys Asp Ala Pro Asp Arg Ile Asn Ser Tyr Leu Asp Thr
970 975 980

TAG TGG AGC CAT GCG CCA ATA GCG GAG ATC TCC GAG AGG AAG ACC GGA 1296
* Trp Ser His Ala Pro Ile Ala Glu Ile Ser Glu Arg Lys Thr Gly
985 990 995 1000

ACT CGT CGG ACG TCG GCG TCC AAA TCG AGG AGG CCG GCA TGA AGC ACA 1344
Thr Arg Arg Thr Ser Ala Ser Lys Ser Arg Arg Pro Ala * Ser Thr
1005 1010 1015

TCG AGG ATG GTG ATC CCC ATA CGG GTA GAT CGG GTC GGC CGC CAT CTC 1392
Ser Arg Met Val Ile Pro Ile Arg Val Asp Arg Val Gly Arg His Leu
1020 1025 1030

ACA CCG AGA TTA GGA TGC TTA AAA CGG TTT TTT TGG CAC TAG CAT TAT 1440
Thr Pro Arg Leu Gly Cys Leu Lys Arg Phe Phe Trp His * His Tyr
1035 1040 1045

TTT GCA TCA TCC GTT GGA GAG AAC ATG AGA GAG CCC CAT TTC TTC CAC 1488
Phe Ala Ser Ser Val Gly Glu Asn Met Arg Glu Pro His Phe Phe His
1050 1055 1060

GGT TCT ACC TAT GGG ATC TTG TTC TGC TTG CAA CCG GGC CTC ACG GAA 1536
Gly Ser Thr Tyr Gly Ile Leu Phe Cys Leu Gln Pro Gly Leu Thr Glu
1065 1070 1075 1080

AAC CCG CGC CAG CGG ACC CAC CCC ATG CTA GCA GGG CAC GGC ACC CGC 1584
Asn Pro Arg Gln Arg Thr His Pro Met Leu Ala Gly His Gly Thr Arg
1085 1090 1095

AGC GGC CGG TCC AAA TGG ACG GTG AGA ACC GCA ACG CGA CAC GCC CGG 1632
Ser Gly Arg Ser Lys Trp Thr Val Arg Thr Ala Thr Arg His Ala Arg
1100 1105 1110

CAC TGT CAG CAA AGC GAG AGC GCG CGC ACG GCA CAC GCA CGC TCG GAC 1680
His Cys Gln Gln Ser Glu Ser Ala Arg Thr Ala His Ala Arg Ser Asp
1115 1120 1125

GAA CGG ACG GTG CGA TCG ATC CCT CCC CCC TCG CTC AAC CAC AGT AGT 1728
Glu Arg Thr Val Arg Ser Ile Pro Pro Pro Ser Leu Asn His Ser Ser
1130 1135 1140

ACC CTG CCA CAC TAT CAC GCA CGC ACT CGA GTC ACA CCT CCC ACG AAG 1776
Thr Leu Pro His Tyr His Ala Arg Thr Arg Val Thr Pro Pro Thr Lys
1145 1150 1155 1160

AAC CAA CAG GAG GCG CGG ATC CCA CCG ATA AAT AAC CCC GCC TCG CCG 1824
Asn Gln Gln Glu Ala Arg Ile Pro Pro Ile Asn Asn Pro Ala Ser Pro
1165 1170 1175

CTC CTC CCC AAA ATC AAT CAC CGA TCG CTC GGG GTT CCC GGC ATG ACG 1872
Leu Leu Pro Lys Ile Asn His Arg Ser Leu Gly Val Pro Gly Met Thr
1180 1185 1190

ATG ATG GCC ATG GCC AAG GCG CCC TGC CTC TGC GCG CGC CCG TCC CTC 1920
Met Met Ala Met Ala Lys Ala Pro Cys Leu Cys Ala Arg Pro Ser Leu
1195 1200 1205

GCC GCG CGC GCG AGG CGG CCG GGG CCG GGG CCG GCG CCG CGC CTG CGA 1968
Ala Ala Arg Ala Arg Arg Pro Gly Pro Gly Pro Ala Pro Arg Leu Arg
1210 1215 1220

CGG TGG CGA CCC AAT GCG ACG GCG GGG AAG GGG GTC GGC GAG GTG TGC 2016
Arg Trp Arg Pro Asn Ala Thr Ala Gly Lys Gly Val Gly Glu Val Cys
1225 1230 1235 1240

GCC GCG GTT GTC GAG GCG GCG ACG AAG GCC GAG GAT GAG GAC GAC GAC 2064
Ala Ala Val Val Glu Ala Ala Thr Lys Ala Glu Asp Glu Asp Asp Asp
1245 1250 1255

GAG GAG GAG GCG GTG GCG GAG GAC AGG TAC GCG CTC GGC GGC GCG TGC 2112
Glu Glu Glu Ala Val Ala Glu Asp Arg Tyr Ala Leu Gly Gly Ala Cys
1260 1265 1270

AGG GTG CTC GCC GGA ATG CCC GCG CCG CTG GGC GCC ACC GCG CTC GCC 2160
Arg Val Leu Ala Gly Met Pro Ala Pro Leu Gly Ala Thr Ala Leu Ala
1275 1280 1285

GGC GGG GTC AAT TTC GCC GTC TAC TCC GGT GGA GCC ACC GCC GCG GCG 2208
Gly Gly Val Asn Phe Ala Val Tyr Ser Gly Gly Ala Thr Ala Ala Ala
1290 1295 1300

CTC TGC CTC TTC ACG CCA GAA GAT CTC AAG GCG GTG GGG TTG CCT CCC 2256
Leu Cys Leu Phe Thr Pro Glu Asp Leu Lys Ala Val Gly Leu Pro Pro
1305 1310 1315 1320

GAG TAG AGT TCA TCA GCT TTG CGT GCG CCG CGC GCC CCC TTT TCT GGC 2304
Glu * Ser Ser Ser Ala Leu Arg Ala Pro Arg Ala Pro Phe Ser Gly
1325 1330 1335

CTG CGA TTT AAG TTT TGT ACT GGG GGA AAT GCT GCA GGA TAG GGT GAC 2352
Leu Arg Phe Lys Phe Cys Thr Gly Gly Asn Ala Ala Gly * Gly Asp
1340 1345 1350

GGA GGA GGT TTC CCT TGA CCC CCT GAT GAA TCG GAC TGG GAA CGT GTG 2400
Gly Gly Gly Phe Pro * Pro Pro Asp Glu Ser Asp Trp Glu Arg Val
1355 1360 1365

GCA TGT CTT CAT TGA AGG CGA GCT GCA CGA CAT GCT TTA CGG GTA CAG 2448
Ala Cys Leu His * Arg Arg Ala Ala Arg His Ala Leu Arg Val Gln
1370 1375 1380

GTT CGA CGG CAC CTT TGC TCC TCA CTG CGG GCA CTA CCT TGA TAT TTC 2496
Val Arg Arg His Leu Cys Ser Ser Leu Arg Ala Leu Pro * Tyr Phe
1385 1390 1395 1400

CAA TGT CGT GGT GGA TCC TTA TGC TAA GGT GAT CAT ACT TTA GCT TTA 2544
Gln Cys Arg Gly Gly Ser Leu Cys * Gly Asp His Thr Leu Ala Leu
1405 1410 1415

CCT GCA TCT TGG TAT TTA CAG TAG AAA TTG TTA CGT GGA CCC TTA TTT 2592
Pro Ala Ser Trp Tyr Leu Gln * Lys Leu Leu Arg Gly Pro Leu Phe
1420 1425 1430

GTT GCC TTT TGT GTT GCT CTA GGC AGT GAT AAG CCG AGG GGA GTA TGG 2640
Val Ala Phe Cys Val Ala Leu Gly Ser Asp Lys Pro Arg Gly Val Trp
1435 1440 1445

CGT TCC GGC GCG TGG TAA CAA TTG CTG GCC TCA GAT GGC TGG CAT GAT 2688
Arg Ser Gly Ala Trp * Gln Leu Leu Ala Ser Asp Gly Trp His Asp
1450 1455 1460

CCC TCT TCC ATA TAG CAC GGT ATG CCT GAT TGC TGA AAA TAT TGG CTG 2736
Pro Ser Ser Ile * His Gly Met Pro Asp Cys * Lys Tyr Trp Leu
1465 1470 1475 1480

CAT TTG TTT CTC TCT TTT TCT CAT ATT TTT CTC CTG TCT TTC ACT TGT 2784
His Leu Phe Leu Ser Phe Ser His Ile Phe Leu Leu Ser Phe Thr Cys
1485 1490 1495

ACT ACA TTG CCT CAG ACA GTC ATG ATC AAA GAG AGC AGT GTC ATT AGA 2832
Thr Thr Leu Pro Gln Thr Val Met Ile Lys Glu Ser Ser Val Ile Arg
1500 1505 1510

CAT TTG TAG TTG TCT GCT GAC TTT GAC CAA AAC TTG TAA TTT ACT GTT 2880
His Leu * Leu Ser Ala Asp Phe Asp Gln Asn Leu * Phe Thr Val
1515 1520 1525

GTT AAA GGT CCT TGA ATC ATA TTT TTT TAT AAT ATT ATG TTT GCA AGT 2928
Val Lys Gly Pro * Ile Ile Phe Phe Tyr Asn Ile Met Phe Ala Ser
1530 1535 1540

GGA AGT AAA GTG AAA TTG CAT CTA GTA TTT GTT GTT GCT GTC TTA GTC 2976
Gly Ser Lys Val Lys Leu His Leu Val Phe Val Val Ala Val Leu Val
1545 1550 1555 1560

GTT TAA TTG GAC ATG CAG TAA AAA GGT TTG CAT CTG CAG TTT GAT TGG 3024
Val * Leu Asp Met Gln * Lys Gly Leu His Leu Gln Phe Asp Trp
1565 1570 1575

GAA GGC GAC CTA CCT CTA AGA TAT CCT CAA AAG GAC CTG GTA ATA TAT 3072
Glu Gly Asp Leu Pro Leu Arg Tyr Pro Gln Lys Asp Leu Val Ile Tyr
1580 1585 1590

GAG ATG CAC TTG CGT GGA TTC ACG AAG CAT GAT TCA AGC AAT GTA GAA 3120
Glu Met His Leu Arg Gly Phe Thr Lys His Asp Ser Ser Asn Val Glu
1595 1600 1605

CAT CCG GGT ACT TTC ATT GGA GCT GTG TCG AAG CTT GAC TAT TTG AAG 3168
His Pro Gly Thr Phe Ile Gly Ala Val Ser Lys Leu Asp Tyr Leu Lys
1610 1615 1620

GTA CAG CTG TAC TTG CTG ACT ACA TAG GAT AAT TTT TAA AGA AAG CTA 3216
Val Gln Leu Tyr Leu Leu Thr Thr * Asp Asn Phe * Arg Lys Leu
1625 1630 1635 1640

CAT ATT AGC CAG AAT TTG GGT TAT TAC AAA AAC TAC TGC ATA CTA TAG 3264
His Ile Ser Gln Asn Leu Gly Tyr Tyr Lys Asn Tyr Cys Ile Leu *
1645 1650 1655

CAG TTA CAT GCT CAT TAT CGA GGA GAT GCT CAC ACG CAT CTT ATT TGG 3312
Gln Leu His Ala His Tyr Arg Gly Asp Ala His Thr His Leu Ile Trp
1660 1665 1670

ATT TAA TAC CCA ATT CTG TTT TGA TAT TGG ACT GTT CCC TCT ACA GGA 3360
Ile * Tyr Pro Ile Leu Phe * Tyr Trp Thr Val Pro Ser Thr Gly
1675 1680 1685

GCT TGG AGT TAA TTG TAT TGA ATT AAT GCC CTG CCA TGA GTT CAA CGA 3408
Ala Trp Ser * Leu Tyr * Ile Asn Ala Leu Pro * Val Gln Arg
1690 1695 1700

GCT GGA GTA CTC AAC CTC TTC TTC CAA GTA AGG ACA TGA ATT TAG TAT 3456
Ala Gly Val Leu Asn Leu Phe Phe Gln Val Arg Thr * Ile * Tyr
1705 1710 1715 1720

TAG CCT GCC AGC ACT GTT TGA GTG AGA GTT CAT ACA CAT TTT GTG CCT 3504
* Pro Ala Ser Thr Val * Val Arg Val His Thr His Phe Val Pro
1725 1730 1735

GCA TAA CTG ATA TTT GTT CAA ACT ATT TTT TTT AGC AGT CAC TCA ACA 3552
Ala * Leu Ile Phe Val Gln Thr Ile Phe Phe Ser Ser His Ser Thr
1740 1745 1750

GTT TTA CAT ATA TAT ATA ATA TAG ACT ATT CGT CAC CCT GGG TGA GGA 3600
Val Leu His Ile Tyr Ile Ile * Thr Ile Arg His Pro Gly * Gly
1755 1760 1765

ATA GTT ATT CTT CAC CCA CCT CTA TTT TAA CAT CTA TGC ACC GTA ATT 3648
Ile Val Ile Leu His Pro Pro Leu Phe * His Leu Cys Thr Val Ile
1770 1775 1780

TTA CGT TTC GTA AAT TTG TCT TAT TTT AGA GAT AAA AAG AGA ACG TAA 3696
Leu Arg Phe Val Asn Leu Ser Tyr Phe Arg Asp Lys Lys Arg Thr *
1785 1790 1795 1800

GAA AAC CTA TAA TCG TCG TAA AAA AAA ATA TGT TAC GTA AAA TTA CAA 3744
Glu Asn Leu * Ser Ser * Lys Lys Ile Cys Tyr Val Lys Leu Gln
1805 1810 1815

ATG TAA AAA CAT AGT GTA AAA TGT ACA TAA AAT ACA TTT TTT GAC CTA  3792
Met * Lys His Ser Val Lys Cys Thr * Asn Thr Phe Phe Asp Leu
1820 1825 1830

TAT TTT TTT TGT TAA TGC CAA ATT TTA TAC AGT AAA TCA ATA TGA ATG  3840
Tyr Phe Phe Cys * Cys Gln Ile Leu Tyr Ser Lys Ser Ile * Met
1835 1840 1845

TAA CTA TTT GTA TTT CAA ATG TAA TTT ATT TAT GAA ATG GTC GTA AGA  3888
* Leu Phe Val Phe Gln Met * Phe Ile Tyr Glu Met Val Val Arg
1850 1855 1860

TTA CCT CGG GTG AAG AAT AAC TTA TTC TGC ACC CTG GGT GAT GAA TAG  3936
Leu Pro Arg Val Lys Asn Asn Leu Phe Cys Thr Leu Gly Asp Glu *
1865 1870 1875 1880

TAA CAC TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA CCG GCT  3984
* His Tyr Ile Tyr Ile Tyr Ile Tyr Ile Tyr Ile Tyr Ile Pro Ala
1885 1890 1895

GCT GCT AAT GAT GTT AAT ATT TCG CAA GTA CCT AAG CTG GAT TTT TCT  4032
Ala Ala Asn Asp Val Asn Ile Ser Gln Val Pro Lys Leu Asp Phe Ser
1900 1905 1910

CCA TGA GAC ATC AAT CCA TAA TTG AAA TTG GTC ACG ACA GTT GAA TAG  4080
Pro * Asp Ile Asn Pro * Leu Lys Leu Val Thr Thr Val Glu *
1915 1920 1925

TTG ATA GCT GAA AAT GAA ATC CAG CAT GCT ACT GTC TTG CCA TCT CCA  4128
Leu Ile Ala Glu Asn Glu Ile Gln His Ala Thr Val Leu Pro Ser Pro
1930 1935 1940

GAC TTG CTA ACA TGA ATT TTG TCT GCC TAC CTG TCA TTT GTA CCA ACG  4176
Asp Leu Leu Thr * Ile Leu Ser Ala Tyr Leu Ser Phe Val Pro Thr
1945 1950 1955 1960

TTC CCA ATT GCC CTC TCA TTA TTC GTG TGT ACC ATG CAT ATG TGT TTT  4224
Phe Pro Ile Ala Leu Ser Leu Phe Val Cys Thr Met His Met Cys Phe
1965 1970 1975

AAC ATG ATT ATT GTT GGC TAT ATT TCT CTT TGG AAA CAT GAC TAA TTT  4272
Asn Met Ile Ile Val Gly Tyr Ile Ser Leu Trp Lys His Asp * Phe
1980 1985 1990

ATC ACC CGT TTT GTA TAA ACT GCT TGT TTT CAT ATC AGG ATG AAC TTT  4320
Ile Thr Arg Phe Val * Thr Ala Cys Phe His Ile Arg Met Asn Phe
1995 2000 2005

TGG GGA TAT TCT ACC ATA AAC TTC TTT TCA CCA ATG ACG AGA TAC ACA  4368
Trp Gly Tyr Ser Thr Ile Asn Phe Phe Ser Pro Met Thr Arg Tyr Thr
2010 2015 2020

TCA GGC GGG ATA AAA AAC TGT GGG CGT GAT GCC ATA AAT GAG TTC AAA  4416
Ser Gly Gly Ile Lys Asn Cys Gly Arg Asp Ala Ile Asn Glu Phe Lys
2025 2030 2035 2040

ACT TTT GTA AGA GAG GCT CAC AAA CGG GGA ATT GAG GTA AGC AAG TCG  4464
Thr Phe Val Arg Glu Ala His Lys Arg Gly Ile Glu Val Ser Lys Ser
2045 2050 2055

TAC GAG TTA GTT GCT CCT TTT GAA CTT ATC AAT TTG ATG CGA AGA CAT  4512
Tyr Glu Leu Val Ala Pro Phe Glu Leu Ile Asn Leu Met Arg Arg His
2060 2065 2070

GTT ACT GCT AGG TGA TCC TGG ATG TTG TCT TCA ACC ATA CAG CTG AGG  4560
Val Thr Ala Arg * Ser Trp Met Leu Ser Ser Thr Ile Gln Leu Arg
2075 2080 2085

GTA ATG AGA ATG GTC CAA TAT TAT CAT TTA GGG GGG TCG ATA ATA CTA  4608
Val Met Arg Met Val Gln Tyr Tyr His Leu Gly Gly Ser Ile Ile Leu
2090 2095 2100

CAT ACT ATA TGC TTG CAC CCA AGG TGA CAG ATC TTT CTT GCT GCG TAA  4656
His Thr Ile Cys Leu His Pro Arg * Gln Ile Phe Leu Ala Ala *
2105 2110 2115 2120

TTG TTC TTT CAT AGA TGT ATA GAG CAT AGA TGT GTT ATG TAG TAG TTC  4704
Leu Phe Phe His Arg Cys Ile Glu His Arg Cys Val Met * * Phe
2125 2130 2135

TTT TTC AAG GGG ATT ATG TTC ATG CAG GGA GAG TTT TAT AAC TAT TCT  4752
Phe Phe Lys Gly Ile Met Phe Met Gln Gly Glu Phe Tyr Asn Tyr Ser
2140 2145 2150

GGC TGT GGG AAT ACC TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC  4800
Gly Cys Gly Asn Thr Phe Asn Cys Asn His Pro Val Val Arg Gln Phe
2155 2160 2165

ATT GTA GAT TGT TTA AGG TAC AGA TAT ACA TTT TAC TTC TAG AAC TAC  4848
Ile Val Asp Cys Leu Arg Tyr Arg Tyr Thr Phe Tyr Phe * Asn Tyr
2170 2175 2180

TTT TTC ATT TCT TTT GCT GCT TGT CAT TTT GAT ATG ATT AAT TTG CAA  4896
Phe Phe Ile Ser Phe Ala Ala Cys His Phe Asp Met Ile Asn Leu Gln
2185 2190 2195 2200

GCT TGT GGG GGT AAA TCT TTT GGT CAG CAT ATT GTA TCT TTA AAT GTC  4944
Ala Cys Gly Gly Lys Ser Phe Gly Gln His Ile Val Ser Leu Asn Val
2205 2210 2215

ACA AAT ACT AAT GTC CTG GTG CTT ATT GAT TTG GCA TCT TCA AAT TCT  4992
Thr Asn Thr Asn Val Leu Val Leu Ile Asp Leu Ala Ser Ser Asn Ser
2220 2225 2230

TCT CCA ATG AAA AGG GAA AAA TCT ACT GTA TGT CTC GTC AAC TAA TTT  5040
Ser Pro Met Lys Arg Glu Lys Ser Thr Val Cys Leu Val Asn * Phe
2235 2240 2245

ACT TTT GTT TTG CAG ATA CTG GGT GAT GGA AAT GCA TGT TGA TGG TTT  5088
Thr Phe Val Leu Gln Ile Leu Gly Asp Gly Asn Ala Cys * Trp Phe
2250 2255 2260

TCG TTT TGA TCT TGC ATC CAT AAT GAC CAG AGG TTC CAG GTA ATT TGT  5136
Ser Phe * Ser Cys Ile His Asn Asp Gln Arg Phe Gln Val Ile Cys
2265 2270 2275 2280

ATT TAT TGT TTG TTT GCG TGT TGC CTT TTC AGA AGA TTC TTA AAA GAA  5184
Ile Tyr Cys Leu Phe Ala Cys Cys Leu Phe Arg Arg Phe Leu Lys Glu
2285 2290 2295

TGT TTC TTT TAC AAG TCT GTG GGA TCC AGT TAA CGT GTA TGG AGC TCC  5232
Cys Phe Phe Tyr Lys Ser Val Gly Ser Ser * Arg Val Trp Ser Ser
2300 2305 2310

AAT AGA AGG TGA CAT GAT CAC AAC AGG GAC ACC TCT TGT TAC TCC ACC  5280
Asn Arg Arg * His Asp His Asn Arg Asp Thr Ser Cys Tyr Ser Thr
2315 2320 2325

ACT TAT TGA CAT GAT CAG CAA TGA CCC AAT TCT TGG AGG CGT CAA GGT  5328
Thr Tyr * His Asp Gln Gln * Pro Asn Ser Trp Arg Arg Gln Gly
2330 2335 2340

ACT TGT TTC ATC CAA CAC CTG TTG TCT GTG TGC ATT CAA TTG TTT TAA  5376
Thr Cys Phe Ile Gln His Leu Leu Ser Val Cys Ile Gln Leu Phe *
2345 2350 2355 2360

TAT GGT AAT GAT CAA TTT CCC AAT GTT GAT AAG GAA AAA AAA TGC AAG  5424
Tyr Gly Asn Asp Gln Phe Pro Asn Val Asp Lys Glu Lys Lys Cys Lys
2365 2370 2375

TAG CTC TCT TTA TCT GCT TCT TGT GAG TTA TGC TAA ACA TGT AGA TAC  5472
* Leu Ser Leu Ser Ala Ser Cys Glu Leu Cys * Thr Cys Arg Tyr
2380 2385 2390

TAC TAT ATT TCA ACT GTA TAT ACT TGA CAT ATT ATT GCT TCC TTG GGA  5520
Tyr Tyr Ile Ser Thr Val Tyr Thr * His Ile Ile Ala Ser Leu Gly
2395 2400 2405

GGC TCT CTT ATT CCT TTC CCC CGT TGC AAT TAT AGC TCA TTG CTG AAG  5568
Gly Ser Leu Ile Pro Phe Pro Arg Cys Asn Tyr Ser Ser Leu Leu Lys
2410 2415 2420

CAT GGG ATG CAG GAG GCC TCT ATC AAG TAG GTC AAT TCC CTC ACT GGA  5616
His Gly Met Gln Glu Ala Ser Ile Lys * Val Asn Ser Leu Thr Gly
2425 2430 2435 2440

ATG TTT GGT CTG AGT GGA ATG GGA AGG TAA GGT ACC TGT TAA AAG TTT  5664
Met Phe Gly Leu Ser Gly Met Gly Arg * Gly Thr Cys * Lys Phe
2445 2450 2455

GAA TGG CAA ATA CTG ATA GAA ATA TAA CTT ATA TTT GCG ACA TAT ATA  5712
Glu Trp Gln Ile Leu Ile Glu Ile * Leu Ile Phe Ala Thr Tyr Ile
2460 2465 2470

GAT AAA GCA AAA TAA TAC GCA TTC CAC CTG AAC TTT AAA GGG GCA CGC  5760
Asp Lys Ala Lys * Tyr Ala Phe His Leu Asn Phe Lys Gly Ala Arg
2475 2480 2485

AGA ATT ATC CCG CAT CTG TCT ACA AGA ATG ATA ACA CAT GTG CTG AAT  5808
Arg Ile Ile Pro His Leu Ser Thr Arg Met Ile Thr His Val Leu Asn
2490 2495 2500

AGT GAA GTA CTA CTT CTC AAA TGT CTG AAT GAA CGC ACT AAC TCT TGT  5856
Ser Glu Val Leu Leu Leu Lys Cys Leu Asn Glu Arg Thr Asn Ser Cys
2505 2510 2515 2520

GAG TGT CAA CCG AGC AAG AAA TAT TTG AGT TTT CTG CAA GAA ATT GTT  5904
Glu Cys Gln Pro Ser Lys Lys Tyr Leu Ser Phe Leu Gln Glu Ile Val
2525 2530 2535

CAT GTT GTG CTG TAT TAT ACT CCC TCC GTC CGA AAT TAT TTG TCG GAG  5952
His Val Val Leu Tyr Tyr Thr Pro Ser Val Arg Asn Tyr Leu Ser Glu
2540 2545 2550

AAA TGG ATG TAT CTA GAC GTA TTT TAG TTC TAG ATA CAT CCA TTT TTA  6000
Lys Trp Met Tyr Leu Asp Val Phe * Phe * Ile His Pro Phe Leu
2555 2560 2565

TCC ATT TCT GCA ACA AGT AGT TCC GGA CGG AGG GAG TAT CAT TTA ACA  6048
Ser Ile Ser Ala Thr Ser Ser Ser Gly Arg Arg Glu Tyr His Leu Thr
2570 2575 2580

AAT ATA TGC ATG TTC GAA GTA AAT CCC CAC GAA TAA GCA TAT AAG ACG  6096
Asn Ile Cys Met Phe Glu Val Asn Pro His Glu * Ala Tyr Lys Thr
2585 2590 2595 2600

ATA TTG CTT TTT GAC TTG CAA CAC CTA AAC CTC ATT GTT TTC TCC TAG  6144
Ile Leu Leu Phe Asp Leu Gln His Leu Asn Leu Ile Val Phe Ser *
2605 2610 2615

GAT TTT GGG TGT TCG AAG CAA GCA GCT GGT GAT ATT TAA TTT ACC TTT  6192
Asp Phe Gly Cys Ser Lys Gln Ala Ala Gly Asp Ile * Phe Thr Phe
2620 2625 2630

GCC TTT ATT TGT AGC TTG ATT TGA GGG TGC GGC AAA GGT TTT AGC TTA  6240
Ala Phe Ile Cys Ser Leu Ile * Gly Cys Gly Lys Gly Phe Ser Leu
2635 2640 2645

GTA GTG TTT TGT AAA TTA TTA TAG TTT ATG TAT ATA CTC CTC ATT TGG  6288
Val Val Phe Cys Lys Leu Leu * Phe Met Tyr Ile Leu Leu Ile Trp
2650 2655 2660

GCA CTT CCG TAC TGG TCC CAT AGA AGA TAA AAA TGG AAT GAT GTC TGG  6336
Ala Leu Pro Tyr Trp Ser His Arg Arg * Lys Trp Asn Asp Val Trp
2665 2670 2675 2680

CCA ATA ATT GTT GAC AAC ACT GTT GCG CAT TTG ATT TTT ATC AGG GAA  6384
Pro Ile Ile Val Asp Asn Thr Val Ala His Leu Ile Phe Ile Arg Glu
2685 2690 2695

TGG AAA ATT GAA ATC GGT AAG AAA CAT TGC GAT ATT AAG CTT GTA TAT  6432
Trp Lys Ile Glu Ile Gly Lys Lys His Cys Asp Ile Lys Leu Val Tyr
2700 2705 2710

GCT AAT GCT GGT GGA TCT TTA AGA GGG AAC ATA TGA TCT CGT GTG CAT  6480
Ala Asn Ala Gly Gly Ser Leu Arg Gly Asn Ile * Ser Arg Val His
2715 2720 2725

CCA TCT TCA ACT AAA AAA ATA TGT TGC ACA TCT CCC ACG TCA CTT ACT  6528
Pro Ser Ser Thr Lys Lys Ile Cys Cys Thr Ser Pro Thr Ser Leu Thr
2730 2735 2740

AGC TAT TTC ATC CAA GTA CTA ACT TGT GTG GTT GTC TCC TCA GTA CCG  6576
Ser Tyr Phe Ile Gln Val Leu Thr Cys Val Val Val Ser Ser Val Pro
2745 2750 2755 2760

GGA CAT TGT GCG CCA ATT CAT TAA AGG CAC TGA TGG ATT TGC TGG TGG  6624
Gly His Cys Ala Pro Ile His * Arg His * Trp Ile Cys Trp Trp
2765 2770 2775

TTT TGC CGA ATG TCT TTG TGG AAG TCC ACA CCT ATA CCA GGT AAG TTG  6672
Phe Cys Arg Met Ser Leu Trp Lys Ser Thr Pro Ile Pro Gly Lys Leu
2780 2785 2790

TGG CAA TAC TTG GAA ATG GGT TGA GTG AAT GTC ACA TGG ATT TTT TAT  6720
Trp Gln Tyr Leu Glu Met Gly * Val Asn Val Thr Trp Ile Phe Tyr
2795 2800 2805

ATA TAC CAC ATG ATG ATA CAC ATG TAA ATA TAT AAC GAT TAT AGT GTA  6768
Ile Tyr His Met Met Ile His Met * Ile Tyr Asn Asp Tyr Ser Val
2810 2815 2820

TGC ATA TGC ATT TGG CTA AGA AGT ACT CCC TCC CTT AGT AAA AGT TAG  6816
Cys Ile Cys Ile Trp Leu Arg Ser Thr Pro Ser Leu Ser Lys Ser *
2825 2830 2835 2840

TAC AAA GTT GAG TCA TCT ATT TTG GAA CGG AGG GAG TAT AAG TGT ATA  6864
Tyr Lys Val Glu Ser Ser Ile Leu Glu Arg Arg Glu Tyr Lys Cys Ile
2845 2850 2855

CAC TAG TGC AAT ATA TAG GTT TTA ACA CCC AAC TTG CCA ATG AAG GAA  6912
His * Cys Asn Ile * Val Leu Thr Pro Asn Leu Pro Met Lys Glu
2860 2865 2870

CAT AGG GCT TTC TAG TTA TCT TAT TTA TTT GTC TGG TGA ATA ATC CAC  6960
His Arg Ala Phe * Leu Ser Tyr Leu Phe Val Trp * Ile Ile His
2875 2880 2885

TGA AAA ATT CCA GCC ATG TCA TTT TTT AGG GGG GGA GAA GAA ACT ACA  7008
* Lys Ile Pro Ala Met Ser Phe Phe Arg Gly Gly Glu Glu Thr Thr
2890 2895 2900

TTG ATT TTT CCC CCT AAA AAA AGC CAT CTC AGA TTT CAT AGG TAA CTT  7056
Leu Ile Phe Pro Pro Lys Lys Ser His Leu Arg Phe His Arg * Leu
2905 2910 2915 2920

GCT TTT CTG TAA AGA AAT GAA AAC GAC TTC ATA CTT TCT GTC GAT TAT  7104
Ala Phe Leu * Arg Asn Glu Asn Asp Phe Ile Leu Ser Val Asp Tyr
2925 2930 2935

AAG TGT ATA CAC TAG TGC AAT ATA TAG GTT TTA ACA CCC AAC TTG CCA  7152
Lys Cys Ile His * Cys Asn Ile * Val Leu Thr Pro Asn Leu Pro
2940 2945 2950

ATG AAG GAA CAT AGG GCT TTC TAG TTA TCT TAT TTA TTT GCT GGT GAA  7200
Met Lys Glu His Arg Ala Phe * Leu Ser Tyr Leu Phe Ala Gly Glu
2955 2960 2965

TAA TCC ACT GAA AAA TTC CAG CCA TGT CAT TTT TTA GGG GGG AGA AGA  7248
* Ser Thr Glu Lys Phe Gln Pro Cys His Phe Leu Gly Gly Arg Arg
2970 2975 2980

AAC TAT ATT GAT TTT TCC CCC TAA AAA AAG CCA TCT CAG ATT CAT AGG  7296
Asn Tyr Ile Asp Phe Ser Pro * Lys Lys Pro Ser Gln Ile His Arg
2985 2990 2995 3000

AAC TTG CTT TTC TGT AAA GAA ATG AAA ACG ACT TCA TAC TTT CTG CGG  7344
Asn Leu Leu Phe Cys Lys Glu Met Lys Thr Thr Ser Tyr Phe Leu Arg
3005 3010 3015

CGC TTA CTT AGC TCG ATG GAT ATT TGT AAG ATG AAT GCC AAA TTA TTT  7392
Arg Leu Leu Ser Ser Met Asp Ile Cys Lys Met Asn Ala Lys Leu Phe
3020 3025 3030

GGC GGG ATT TGA TCG TTA TTC CAA ATT TCA TTT GGT TTC TCT AGC AAT  7440
Gly Gly Ile * Ser Leu Phe Gln Ile Ser Phe Gly Phe Ser Ser Asn
3035 3040 3045

CAA CCC AGT ACC TTG TTA TTG GCA CTG CAA TTT CTT ATT GAT TAA TCA  7488
Gln Pro Ser Thr Leu Leu Leu Ala Leu Gln Phe Leu Ile Asp * Ser
3050 3055 3060

GGC AGG AGG AAG GAA ACC TTG GCA CAG TAT CAA CTT GGT ATG TGC ACA  7536
Gly Arg Arg Lys Glu Thr Leu Ala Gln Tyr Gln Leu Gly Met Cys Thr
3065 3070 3075 3080

TGA TGG ATT TAC ACT GGG TGA TTT GGT ACA TAT AAT ACC AAG TCA ATT  7584
* Trp Ile Tyr Thr Gly * Phe Gly Thr Tyr Asn Thr Lys Ser Ile
3085 3090 3095

TAC CAA ATG GGG AGA CCA ATA GAG ATG GAG AAA ATC ACA ATC TTA GCT  7632
Tyr Gln Met Gly Arg Pro Ile Glu Met Glu Lys Ile Thr Ile Leu Ala
3100 3105 3110

GGA ATT GTG GGG AGG TAA TTC TGA ACT CTC CTT TTT TTT TGA AAT TTT  7680
Gly Ile Val Gly Arg * Phe * Thr Leu Leu Phe Phe * Asn Phe
3115 3120 3125

CAT GCT TTA CAT AAT AGT CAA ATG GCT GAC AAA TGT CGT TGT ATG GTT  7728
His Ala Leu His Asn Ser Gln Met Ala Asp Lys Cys Arg Cys Met Val
3130 3135 3140

CTC TCT ACC TAA ACC GTT AAG GCA GTA AGA GTT TCC CTA CAA GAT CTC  7776
Leu Ser Thr * Thr Val Lys Ala Val Arg Val Ser Leu Gln Asp Leu
3145 3150 3155 3160

TTT GTT CGT ATA ATT GTA TTT TCT AGA GAA AAG TTG CCT TCA ATT TTG  7824
Phe Val Arg Ile Ile Val Phe Ser Arg Glu Lys Leu Pro Ser Ile Leu
3165 3170 3175

TGC ACG CGG CAG TAC AGG AAT TGT GGT TAT AAA TAT TGA TAC AGG CTG  7872
Cys Thr Arg Gln Tyr Arg Asn Cys Gly Tyr Lys Tyr * Tyr Arg Leu
3180 3185 3190

ACC ATC GTT ACT AAT AGG GGG AAC AAT AAG CAC ATT TTT TTA ATA GCA  7920
Thr Ile Val Thr Asn Arg Gly Asn Asn Lys His Ile Phe Leu Ile Ala
3195 3200 3205

AAG GCA TCA CCC TTG TTC CGT TTC CAA TGA AAT CAC AGT ATC CGA ACC  7968
Lys Ala Ser Pro Leu Phe Arg Phe Gln * Asn His Ser Ile Arg Thr
3210 3215 3220

ATA AGT TTT ACA AGT ATG CGT AGA GAG AAA TAA AGT ATC AAC CCG GCA  8016
Ile Ser Phe Thr Ser Met Arg Arg Glu Lys * Ser Ile Asn Pro Ala
3225 3230 3235 3240

GAA ACA GTT GTT TCA GGC GCA AAG AGA AAA GGA AAC GAT ATG CTC TAT  8064
Glu Thr Val Val Ser Gly Ala Lys Arg Lys Gly Asn Asp Met Leu Tyr
3245 3250 3255

TAC ATC AAC CTT TTA GCA TTT AGG GAC GAC CAG CAT CAT CCC ATC TTC  8112
Tyr Ile Asn Leu Leu Ala Phe Arg Asp Asp Gln His His Pro Ile Phe
3260 3265 3270

AAT CAA CTG GAG CGA GGT CAC CTC CAA TCT TCT CAG CAG CCT CAG AGT  8160
Asn Gln Leu Glu Arg Gly His Leu Gln Ser Ser Gln Gln Pro Gln Ser
3275 3280 3285

GGT GAC CTC CCA AGC AAG TGC ATC AGC ATC CAT CAT CTG GGG GTT GGG  8208
Gly Asp Leu Pro Ser Lys Cys Ile Ser Ile His His Leu Gly Val Gly
3290 3295 3300

CAC ATA CCA TGA GCA CAA TCA CCT GAA TTT GAT GAA TTT TCC TCT GTT  8256
His Ile Pro * Ala Gln Ser Pro Glu Phe Asp Glu Phe Ser Ser Val
3305 3310 3315 3320

TAC CTT GCA GCA GAC CCC TGC CGT ATA AAT GGT TTT AAA TGA CAG CAT  8304
Tyr Leu Ala Ala Asp Pro Cys Arg Ile Asn Gly Phe Lys * Gln His
3325 3330 3335

GTT CTT TCA GTT TGA GCA AAA TTT GTG CAA TTG CAA AGA AGC TTT AGA  8352
Val Leu Ser Val * Ala Lys Phe Val Gln Leu Gln Arg Ser Phe Arg
3340 3345 3350

ATC ATG TGG AAC ATG CAC TTA CAT TTC ATC TGA CAA TAT AGG AAG GAG  8400
Ile Met Trp Asn Met His Leu His Phe Ile * Gln Tyr Arg Lys Glu
3355 3360 3365

AGC CCG ACG TCG CAT GCT CCT CTA GAC TCG AGG AAT TCG CAA GAT TGT  8448
Ser Pro Thr Ser His Ala Pro Leu Asp Ser Arg Asn Ser Gln Asp Cys
3370 3375 3380

CTG TCA AAA GAT TGA GGA AGA GGC AGA TGC GCA ATT TCT TTG TTT GTC  8496
Leu Ser Lys Asp * Gly Arg Gly Arg Cys Ala Ile Ser Leu Phe Val
3385 3390 3395 3400

TCA TGG TTT CTC AAG TAA GAC TTA TAT CTG ATC TCT TCA ATT TTT GAG  8544
Ser Trp Phe Leu Lys * Asp Leu Tyr Leu Ile Ser Ser Ile Phe Glu
3405 3410 3415

ATT GCC TGT TTT TCA CAA TGG CAT ATG TTG TCA GGT GAA ACA TCC AAT  8592
Ile Ala Cys Phe Ser Gln Trp His Met Leu Ser Gly Glu Thr Ser Asn
3420 3425 3430

CCC AGT ATT AAT AGA GCC AAC ATG AAG GGA TTG CTT ATC TGA GAT ATC  8640
Pro Ser Ile Asn Arg Ala Asn Met Lys Gly Leu Leu Ile * Asp Ile
3435 3440 3445

TGC CAA AGT TGA ATT CTT AGA TTC ACC TTC TTC AGT ATT TCA GAC CTT  8688
Cys Gln Ser * Ile Leu Arg Phe Thr Phe Phe Ser Ile Ser Asp Leu
3450 3455 3460

CTA AGC ATT TTC ATT TTT TTT TTC AAT TGT TAG GGA GTT CCA ATG TTT  8736
Leu Ser Ile Phe Ile Phe Phe Phe Asn Cys * Gly Val Pro Met Phe
3465 3470 3475 3480

TAC ATG GGC GAT GAA TAT GGC CAC ACA AAA GGG GGC AAC AAC AAT ACA  8784
Tyr Met Gly Asp Glu Tyr Gly His Thr Lys Gly Gly Asn Asn Asn Thr
3485 3490 3495

TAC TGC CAT GAT TCT TAT GTC AGT ACA ATT TGG TCA CAT ATT GTT GTT  8832
Tyr Cys His Asp Ser Tyr Val Ser Thr Ile Trp Ser His Ile Val Val
3500 3505 3510

CTA AGT AAC TAT CTT CAA ATC TTT GCA TTC ATC CGT CAT GGC TCT TCT  8880
Leu Ser Asn Tyr Leu Gln Ile Phe Ala Phe Ile Arg His Gly Ser Ser
3515 3520 3525

GTA GGT CAA TTA TTT TCG CTG GGA TAA AAA AGA ACA ATA CTC TGA CTT  8928
Val Gly Gln Leu Phe Ser Leu Gly * Lys Arg Thr Ile Leu * Leu
3530 3535 3540

GCA AAG ATT CTG CTG CCT CAT GAC CAA ATT CCG CAA GTA AGT ATT CCG  8976
Ala Lys Ile Leu Leu Pro His Asp Gln Ile Pro Gln Val Ser Ile Pro
3545 3550 3555 3560

TTG AAT AAT TTC TGT GTA GAA CCA CTG AAG GTG CCT CCA AAC GCT AAG  9024
Leu Asn Asn Phe Cys Val Glu Pro Leu Lys Val Pro Pro Asn Ala Lys
3565 3570 3575

CGA GCA AGG TCA ATT TCA CAC CCT AAT CAA GTT GGT GTT GTC TAT TTG  9072
Arg Ala Arg Ser Ile Ser His Pro Asn Gln Val Gly Val Val Tyr Leu
3580 3585 3590

TGT ATT TGA TCT GCT GCA CTG TAG GGA GTG CGA GGG TCT TGG CCT TGA  9120
Cys Ile * Ser Ala Ala Leu * Gly Val Arg Gly Ser Trp Pro *
3595 3600 3605

GGA CTT TCC AAC GGC CGA ACG GCT GCA GTG GCA TGG TCA TCA GCC TGG  9168
Gly Leu Ser Asn Gly Arg Thr Ala Ala Val Ala Trp Ser Ser Ala Trp
3610 3615 3620

GAA GCC TGA TTG GTC TGA GAA TAG CCG ATT CGT TGC CTT TTC CAT GGT  9216
Glu Ala * Leu Val * Glu * Pro Ile Arg Cys Leu Phe His Gly
3625 3630 3635 3640

ACA CAT ATA GTT CTG ACA CTT CAC TAT AGT TGT TTT AAA AAA GAA AAT  9264
Thr His Ile Val Leu Thr Leu His Tyr Ser Cys Phe Lys Lys Glu Asn
3645 3650 3655

TTA ACT CAA AAG TAA ATT ATG GAG A  9289
Leu Thr Gln Lys * Ile Met Glu
3660
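
The amino acid lines of the listing above follow from reading the nucleotide sequence in successive triplets under the standard genetic code, with * marking a stop codon. The following is a minimal illustrative sketch of that reading and is not part of the specification; the names `translate`, `CODON_TABLE` and `THREE_LETTER` are assumptions made for this example only.

```python
# Standard genetic code, written out in the conventional T, C, A, G order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_1[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue names; a stop codon stays as "*".
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate(nucleotides: str) -> str:
    """Translate a DNA string (spaces allowed) in reading frame 1.

    Trailing bases that do not complete a codon are ignored.
    """
    seq = "".join(nucleotides.split()).upper()
    usable = len(seq) - len(seq) % 3
    residues = [
        THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)
    ]
    return " ".join(residues)


if __name__ == "__main__":
    # Final line of the listing above (nucleotides 9265 to 9289).
    print(translate("TTA ACT CAA AAG TAA ATT ATG GAG A"))
    # prints: Leu Thr Gln Lys * Ile Met Glu
```

Because the listing is a continuous frame-1 reading of genomic sequence, stop codons are expected wherever the frame passes through non-coding regions.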

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CLAIMS 



1. A nucleic acid sequence encoding an enzyme of the
starch biosynthetic pathway in a cereal plant, wherein the
enzyme is selected from the group consisting of starch
branching enzyme I, starch branching enzyme II, soluble
starch synthase I, and debranching enzyme, with the proviso
that the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

2. A sequence according to claim 1, wherein the
sequence is a genomic DNA or cDNA sequence.

3. A sequence according to claim 1 or claim 2,
wherein the sequence is functional in wheat.

4. A sequence according to any one of claims 1 to 3,
wherein the sequence is derived from a Triticum species.

5. A sequence according to claim 4, wherein the
Triticum species is Triticum tauschii.

6. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme I or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 5 or SEQ ID NO: 9.

7. A sequence according to claim 6, wherein the
homology is at least 90%.

8. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme II or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 10.

9. A sequence according to claim 8, wherein the
homology is at least 90%.

10. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes soluble starch synthase or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 11 or SEQ ID NO: 13.

11. A sequence according to claim 10, wherein the
homology is at least 90%.

12. A sequence according to claim 11, wherein the
sequence encodes a 75 kD soluble starch synthase of wheat.

13. A sequence according to claim 12, which encodes an
amino acid sequence at least 70% homologous to that shown in
SEQ ID NO: 14.

14. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes debranching enzyme or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 17.

15. A sequence according to claim 14, wherein the
homology is at least 90%.

16. A promoter of an enzyme selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, soluble starch synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize.

17. A promoter according to claim 16, wherein the
promoter is a starch branching enzyme I promoter or a
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID NO: 8.

18. A sequence according to claim 17, wherein the
homology is at least 90%.

19. A promoter according to claim 16, wherein the
promoter is a soluble starch synthase I promoter or a
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID NO: 15.

20. A sequence according to claim 19, wherein the
homology is at least 90%.

21. A nucleic acid construct comprising a nucleic acid
sequence encoding an enzyme of the starch biosynthetic
pathway in a cereal plant, operably linked to one or more
nucleic acid sequences facilitating expression of the
nucleic acid sequence in a plant, wherein the enzyme is
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, soluble starch
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize, or a
biologically-active fragment thereof.

22. A nucleic acid construct for targeting a gene to
the endosperm of a cereal plant, comprising one or more
promoter sequences selected from the group consisting of
SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a protein, wherein the expression of the targeted
gene in the endosperm of a cereal plant is modified.






WO 99/14314 



PCT/AU98/00743 




23. A construct according to either claim 21 or claim
22, wherein the promoter or nucleic acid sequence is also
operatively linked to one or more additional targeting
sequences and/or one or more 3' untranslated sequences.

24. A construct according to claim 23, wherein the
nucleic acid encoding the protein is either in the sense or
antisense orientation.

25. A construct according to claim 24, wherein the
protein is an enzyme of the starch biosynthetic pathway.

26. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the antisense
orientation, and the enzyme is selected from the group
consisting of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, and grain softness protein I.

27. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the sense
orientation, and the enzyme is selected from the group
consisting of bacterial isoamylase, bacterial glycogen
synthase, and wheat high molecular weight glutenin Bx17.

28. A construct according to any one of claims 21 to
27, wherein the plant is a cereal plant.

29. A construct according to claim 28, wherein the
cereal plant is either wheat or barley.

30. A construct according to claim 29, wherein the
cereal plant is wheat.

31. A construct according to any one of claims 21 to
30, wherein the construct is either a plasmid or a vector.

32. A construct according to claim 31, wherein the
plasmid or vector is suitable for use in the transformation
of a plant.

33. A construct according to claim 32, wherein the
plasmid is selected from the group consisting of those
depicted in Figures 22a to 22f.

34. A construct according to claim 32, wherein the
vector is a bacterium of the genus Agrobacterium.

35. A construct according to claim 34, wherein the
vector is Agrobacterium tumefaciens.

36. A method of modifying the characteristics of
starch produced by a plant, comprising the steps of:
(a) introducing a nucleic acid sequence encoding
an enzyme of the starch biosynthetic pathway into a host
plant, and/or
(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,
wherein the enzyme is selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, soluble starch synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize, and wherein if both steps (a) and (b) are
used, the enzymes in the two steps are different.

37. A method according to claim 36, wherein the plant
is a cereal plant.

38. A method according to claim 37, wherein the cereal
plant is wheat or barley.

39. A method of targeting expression of a gene to the
endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to any one
of claims 21 to 35.

40. A method of modulating the time of expression of a
gene in endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to any one
of claims 21 to 35.

41. A method according to claim 40, wherein when
expression at an early stage following anthesis is desired,
the construct comprises either the SBE II, SSS I, or DBE
promoter.

42. A method according to claim 40, wherein when
expression at a later stage following anthesis is desired,
the construct comprises the SBE I promoter.

43. A plant transformed with a construct according to
any one of claims 21 to 35.

44. A plant according to claim 43, wherein the plant
is a cereal plant.

45. A plant according to claim 44, wherein the cereal
plant is wheat or barley.

46. A method of identifying variations in the starch
synthesis characteristics of a cereal plant, comprising the
step of identifying a variation in nucleic acid sequence in
the intron regions of the SBE I, SBE II, SSS I or DBE genes.

47. A method of identifying variations in the starch
synthesis characteristics of a cereal plant, comprising the
step of identifying a variation in nucleic acid sequence
compared to the sequence shown in one or more of SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID
NO: 13, SEQ ID NO: 15, SEQ ID NO: 16, or SEQ ID NO: 17.

48. A method according to claim 47, in which a
mutation or absence of a SBE I, SBE II, SSS I or DBE gene is
detected.

49. A method according to either claim 47 or claim 48,
in which the cereal plant is wheat or barley.

50. A product comprising plant material propagated
from a plant transformed with a nucleic acid sequence
encoding an enzyme of the starch biosynthetic pathway in a
cereal plant, operably linked to one or more nucleic acid
sequences facilitating expression of the nucleic acid
sequence in a plant, wherein the enzyme is selected from the
group consisting of starch branching enzyme I, starch
branching enzyme II, soluble starch synthase I, and
debranching enzyme, with the proviso that the enzyme is not
soluble starch synthase I of rice, or starch branching
enzyme I of rice or maize, or a biologically-active fragment
thereof.

51. A product comprising plant material propagated
from a plant in which a gene was targeted to the endosperm
of a cereal plant, by a nucleic acid construct comprising
one or more promoter sequences selected from the group
consisting of SBE I promoter, SBE II promoter, SSS I
promoter, and DBE promoter, operatively linked to a nucleic
acid sequence encoding a protein, wherein the expression of
the targeted gene in the endosperm of a cereal plant is
modified.

52. A product according to claim 50 or claim 51
wherein the product is a food product.

Figure 20b

Figure 21b

Primer Set  Key        Forward Primer  Forward Primer Sequence
1           E01'/E02   WBE2E1F         CGT CGC TGC TCC TCA GGA AG
2           E01/E02    sr854.1180F     CTG GCT GAG TCA ATC ACT ACG
3           E02/E03    WBE2E2F         CGC AAC CTG AAG AAT TAC AG
4           E03/E04    WBE2E3F         ATT TTC GGA GCC ATC TTG AC
5           E04/E05    WBE2E4F         TCG TGG TTA TGA AAA GCT TGG
6           E05/E06    sr913F          ATC ACT TAC CGA GAA TGG G
7           E05/I05    sr913F          ATC ACT TAC CGA GAA TGG G
8           E06/E07    WBE2E6F         ACA ATT GGA ATC CAA ATG CA
9           E07/E08    WBE2E7F         AGC TAT TCC TCA TGG CTC AC
10          E08/E09    WBE2E8F         TGC AGG CTC CAG GTG AAA TA
11          E10/E11    da5.seq         GGC TTG GAT ACA ATG CAG TGC
12          E12/E13    da151.seq       TTG ACG GCT TGA ATG GTT TC
13          E17/E18    WBE2E17F        TTT AGG TGG TGA AGG CTA TCT
14          E18/E19    sr860R          AAT GGA TAG ATT TTC CAA GAG G
15          E19_3'     WBE2-2395F      AGC AGA ACT GCG GTC GTG TA



Reverse Primer  Reverse Primer Sequence        Temp   bp
WBE2E2R         CAG GAC CTT CCC TGG AGA GG     57.4   401
WSBE9E2R        GGC ACG AGT GTG TGT ACC TGT A  57.7   601
sr866F          TAT CTT CAG GTA TCT ACA GC     49.8   309
WBE2E4R2        ATG CTT CCA ATC CAC CTT CA            >450
WBE2E5R         GAG CCC ATT CTC GGT AAG TGA    50.5
                CTG CAT TTG GAT TCC AAT TG     49.9
                                               46.6
WBE2E13R        ATT CTT CAA GCC ACC ATC TC
                                               50.2   258
da23.seq        TGC TGC ATT GCC TGA TCG AA     50.4   ~295
WBE2-2634R      AAC ACC CAG GCC CGT CCA TT     57.2   240
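
Primer pairs such as those tabulated above are commonly inspected by taking the reverse complement of the reverse primer, which gives its sequence as read along the forward strand, and by noting the G+C content of each primer. The sketch below is illustrative only and is not part of the specification; the helper names `clean`, `reverse_complement` and `gc_fraction` are hypothetical.

```python
# Mapping of each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(primer: str) -> str:
    """Remove the spacing used in the table and upper-case the bases."""
    return "".join(primer.split()).upper()


def reverse_complement(primer: str) -> str:
    """Return the primer as it reads on the opposite strand, 5' to 3'."""
    return clean(primer).translate(COMPLEMENT)[::-1]


def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    seq = clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Reverse primer WBE2E5R, as listed in the table above.
    print(reverse_complement("GAG CCC ATT CTC GGT AAG TGA"))
    # prints: TCACTTACCGAGAATGGGCTC

    # Forward primer WBE2E1F, as listed in the table above.
    print(gc_fraction("CGT CGC TGC TCC TCA GGA AG"))
    # prints: 0.65
```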
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SBE II Intron 10 primer set - digested with Dde1 
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